Web Hosting Talk

View Full Version : Linkwalker - robots.txt


saghir69
08-20-2004, 09:57 AM
Hi, LinkWalker seems to be some link-checking service that is using a lot of bandwidth for no reason (I did not join the service),
so I want to disallow this agent in my robots.txt file!

So is the file below written correctly? Will it disallow only LinkWalker from accessing the whole site, while the other agents follow the rules for *?

User-agent: *
Disallow: /forum/admin/
Disallow: /forum/db/
Disallow: /forum/images/
Disallow: /forum/includes/
Disallow: /forum/language/
Disallow: /forum/templates/
Disallow: /forum/common.php
Disallow: /forum/groupcp.php
Disallow: /forum/memberlist.php
Disallow: /forum/modcp.php
Disallow: /forum/posting.php
Disallow: /forum/profile.php
Disallow: /forum/privmsg.php
Disallow: /forum/viewonline.php
Disallow: /forum/faq.php
Disallow: /forum/updates-topic.html*$
Disallow: /forum/stop-updates-topic.html*$
Disallow: /forum/ptopic*.html$
Disallow: /forum/ntopic*.html$
Disallow: /forum/post-*.html$

User-agent: LinkWalker
Disallow: /

kenmitch
08-21-2004, 01:50 AM
Hey, I'm just a newbie :) I just started looking into this about half an hour ago, after I viewed my server logs.

But wouldn't it be

User-agent: linkwalker
Disallow: /

Not 100% sure, but that should whack only linkwalker out of the picture

User-agent: *
Disallow: /forum

Pretty sure that would whack all robots out of the forum folder
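
For what it's worth, here is a rough Python sketch of how a robot following the original standard would treat a Disallow value: as a plain prefix of the URL path. The function name `is_disallowed` and the sample paths are made up for illustration, and real crawlers may layer extras (such as wildcards) on top of this.

```python
def is_disallowed(path, disallow_prefixes):
    """Return True if the path starts with any non-empty Disallow value."""
    # An empty Disallow value means "nothing is disallowed", so skip it.
    return any(prefix and path.startswith(prefix) for prefix in disallow_prefixes)


rules = ["/forum"]  # the single rule from the post above
for path in ["/forum/viewtopic.php", "/forum", "/forums.html", "/index.html"]:
    print(path, "blocked" if is_disallowed(path, rules) else "allowed")
```

Because the match is a simple prefix, "Disallow: /forum" also catches a path like /forums.html. Writing it as "Disallow: /forum/" limits the rule to the folder itself.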

You could always check this tutorial for more info: http://www.searchengineworld.com/robots/robots_tutorial.htm


Thanks,
Kenmitch

JayC
08-23-2004, 05:01 PM
Originally posted by kenmitch
But wouldn't it be

User-agent: linkwalker
Disallow: /

Not 100% sure but that should wack linkwalker only out of the picture

The way saghir69 had it, "LinkWalker," shouldn't cause any problems. According to the Robots Exclusion Standard, the robot should do a "case insensitive substring match of the name without version information..." Linkwalker supposedly (according to the site of the company that runs it) follows the standard, so as long as they're being accurate in saying that, case shouldn't matter.
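
To make the matching rule concrete, here is a minimal Python sketch of a "case insensitive substring match of the name without version information". The function name `record_applies` is hypothetical. The direction of the substring test (the robots.txt value must appear inside the robot's own name) is an assumption, since the standard's wording leaves that open.

```python
def record_applies(record_agent, robot_name):
    """Return True if a robots.txt User-agent value applies to this robot."""
    if record_agent == "*":
        return True  # the wildcard record applies to every robot
    # Drop version information: "LinkWalker/2.0" becomes "LinkWalker".
    name = robot_name.split("/")[0]
    # Assumption: the record value is looked for inside the robot's name.
    return record_agent.lower() in name.lower()


print(record_applies("linkwalker", "LinkWalker/2.0"))
print(record_applies("LinkWalker", "linkwalker"))
print(record_applies("LinkWalker", "Googlebot/2.1"))
```

Under this rule "LinkWalker" and "linkwalker" select the same robot, so either spelling in robots.txt should behave the same for a robot that follows the standard.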

User-agent: *
Disallow: /forum

Pretty sure would wack all out of the forum folder

Yep. But I think that's what he's intending to do: exclude spiders in general from several folders, and exclude linkwalker completely from the entire site.
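
If you want to sanity-check a layout like this before uploading it, one option is Python's standard `urllib.robotparser` module. The sketch below feeds it a cut-down version of the file (only two of the forum rules, with the wildcard lines left out to keep the check to plain prefix rules) and asks what each robot may fetch. "Googlebot" and the sample paths are just stand-ins, and this only shows how one parser reads the file, not how LinkWalker itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Cut-down robots.txt: a general record plus a LinkWalker-only record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/admin/
Disallow: /forum/posting.php

User-agent: LinkWalker
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
checks = [
    ("LinkWalker", "/index.html"),
    ("linkwalker/2.0", "/forum/viewtopic.php"),
    ("Googlebot", "/forum/admin/index.php"),
    ("Googlebot", "/forum/viewtopic.php"),
]
for agent, path in checks:
    print(agent, path, "allowed" if parser.can_fetch(agent, path) else "blocked")
```

With this parser, LinkWalker is refused everything, while other robots are only kept out of the listed forum paths.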