Is there a web crawler (other than Google) that caches Myspace profiles?
Myspace uses robots.txt to tell automated bots not to crawl its site.
[url]http://www.myspace.com/robots.txt[/url]
However, search engines still cache some Myspace pages. Since Google probably has the biggest cache of Myspace pages, I can only suggest using Google (or a similar search engine, such as MSN).
Archive.org could work (I haven't checked), but as Myspace uses a robots.txt it's highly unlikely to work.
Sorry I couldn't help much.
The file Troll mentioned (robots.txt) is a way of telling bots not to crawl your website. It only works for bots that choose to obey it, but the major search engines do. You can disallow all crawlers, or only those with a specific user agent.
In Myspace's robots.txt, they block ia_archiver. I just Googled, and it's the user agent string of [url]www.archive.org[/url]. I was going to suggest that as a place to check, but they're blocked from caching Myspace pages.
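If you want to check this sort of thing yourself, here is a rough sketch using Python's standard urllib.robotparser module. The robots.txt text below is a made-up sample that blocks ia_archiver, not a copy of Myspace's actual file, and the is_allowed helper and the example.com URLs are just names I picked for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only
# (NOT the real Myspace file): blocks ia_archiver everywhere,
# and blocks every other bot from /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "Googlebot"):
        allowed = is_allowed(
            SAMPLE_ROBOTS_TXT, agent, "http://www.example.com/someprofile"
        )
        print(agent, "allowed" if allowed else "blocked")
```

To test a live site instead, you could call set_url() with the address of its robots.txt and then read() on the parser before asking can_fetch().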
Other than that, try Coral:
[url]http://www.coralcdn.org/[/url]
... or, just search Google for "search engine" and try the cached versions on all the search engines you can find. Yahoo, Live Search, et al.