Web Hosting Talk







What is the best Perl module to make spiders?


jjk2
07-14-2008, 03:21 PM
What is the best module on CPAN for writing spiders?

Also, where do I specify the location of robot.txt?

Most sites don't seem to have one?

For example, http://www.digg.com/robot.txt does not exist...

Maybe I'm doing it wrong.


Also, if a site has no robot.txt, does that mean I can spider it?

Thanks! Also, can someone suggest a book on spidering specifically with Perl?

cygnusd
07-18-2008, 12:13 AM
digg.com's robots file exists at: http://digg.com/robots.txt (the file name is robots.txt, not robot.txt)

WWW::Mechanize should help you automate spidering a site. It is a subclass of LWP::UserAgent, so you can customize how 'robots.txt' is handled.
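
Here is a rough sketch of one way to combine the two, not the only way to do it. As far as I know WWW::Mechanize does not check robots.txt by itself, so this uses WWW::RobotRules (from the libwww-perl family) to decide whether a URL may be fetched. The agent name 'ExampleSpider/0.1', the example.com URLs and the helper names load_rules and fetch_links are made up for illustration. The demo at the bottom parses an inline robots.txt and does not touch the network.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical user-agent name for this sketch.
my $AGENT = 'ExampleSpider/0.1';

# Build a rules object from the text of a robots.txt file.
# $robots_url is the URL the file was (or would be) fetched from.
sub load_rules {
    my ($robots_url, $robots_txt) = @_;
    my $rules = WWW::RobotRules->new($AGENT);
    $rules->parse($robots_url, $robots_txt);
    return $rules;
}

# Fetch a page with WWW::Mechanize only if the rules permit it,
# and return the absolute URLs of the links found on that page.
sub fetch_links {
    my ($rules, $url) = @_;

    # allowed() can return a negative value when no fresh rules are
    # loaded for the host, so require exactly 1 before fetching.
    return () unless $rules->allowed($url) == 1;

    require WWW::Mechanize;    # loaded lazily; only needed for real fetches
    my $mech = WWW::Mechanize->new(agent => $AGENT, autocheck => 0);
    $mech->get($url);
    return () unless $mech->success;

    return map { $_->url_abs->as_string } $mech->links;
}

# Offline demonstration with an inline robots.txt.
my $robots_txt = <<'END';
User-agent: *
Disallow: /private/
END

my $rules = load_rules('http://example.com/robots.txt', $robots_txt);

for my $url ('http://example.com/index.html',
             'http://example.com/private/a.html') {
    my $verdict = $rules->allowed($url) == 1 ? 'allowed' : 'disallowed';
    print "$url: $verdict\n";
}
```

In a real spider you would first fetch the site's robots.txt (for instance with $mech->get) and pass its content to load_rules, then call fetch_links for each URL in your queue. LWP::RobotUA is another option if you want the robots.txt fetching and checking done for you.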

Try searching for more info at perlmonks.org