All Net Tools - Forum > Main > Internet Privacy
  #1  
04-28-2007, 04:11 PM
Beyerstein00
Registered User
 
Join Date: Apr 2007
Posts: 1
Myspace question

Is there a web crawler (other than Google) that caches Myspace profiles?
  #2  
04-28-2007, 05:14 PM
Troll
Registered User
 
Join Date: Nov 2006
Location: East of Happy Nonsense
Posts: 178
Myspace uses a robots.txt file to tell automated bots not to crawl its site.

http://www.myspace.com/robots.txt

However, search engines still cache some Myspace pages. As Google probably has the biggest cache of Myspace pages, I can only suggest using Google (or a similar search engine, such as MSN).

Archive.org could work (I haven't checked), but as Myspace uses a robots.txt, it's highly unlikely.
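
For anyone who wants to check rather than guess, here is a rough Python sketch that asks the Internet Archive's Wayback availability endpoint whether it holds a snapshot of a page. The helper names (`availability_url`, `closest_snapshot`, `lookup`) are made up for illustration, and the response fields used are an assumption based on that endpoint's JSON format, so treat it as a starting point only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's Wayback availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url):
    """Build the query URL for a given page (hypothetical helper)."""
    return WAYBACK_API + "?" + urlencode({"url": page_url})


def closest_snapshot(response_text):
    """Return the snapshot URL from a JSON response, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url):
    """Query the endpoint over the network and return a snapshot URL or None."""
    with urlopen(availability_url(page_url), timeout=10) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `lookup()` with a profile URL returns either a web.archive.org link or `None`. A `None` result would fit the robots.txt explanation, but it does not prove it.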

Sorry I couldn't help much.
  #3  
04-28-2007, 05:31 PM
Ezekiel
Moderator
 
Join Date: Sep 2005
Location: UK
Posts: 2,071
The file Troll mentioned (robots.txt) is a way of asking bots not to crawl your website. You can disallow all crawlers, or only those with a specific user agent. Compliance is voluntary, but the major search engines respect it.

Myspace's robots.txt blocks ia_archiver. I just Googled it, and that's the user agent string of www.archive.org. I was going to suggest that as a place to check, but they're blocked from caching Myspace pages.
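
To see how per-agent rules play out, here is a small Python sketch using the standard library's `urllib.robotparser`. The rules in `SAMPLE_ROBOTS_TXT` are made up for illustration and are not a copy of Myspace's actual file. The `can_crawl` helper and the `SomeOtherBot` agent name are also hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, NOT Myspace's real robots.txt:
# block ia_archiver everywhere, block everyone else from /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/profile"
    for agent in ("ia_archiver", "SomeOtherBot"):
        print(agent, can_crawl(SAMPLE_ROBOTS_TXT, agent, page))
```

To test a live site instead, `RobotFileParser` can also be pointed at a real robots.txt with `set_url()` and `read()`, which needs network access.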

Other than that, try Coral:

http://www.coralcdn.org/

... or just search Google for "search engine" and try the cached versions on every search engine you can find: Yahoo, Live Search, et al.