Web Hosting Talk

robots.txt help


grabmail
08-03-2006, 07:17 PM
When I set

Disallow: /secret/

am I asking the bot not to index the URL www.domain.com/secret/

or

am I asking the bot not to index any files in the physical folder ./secret?

The reason I'm asking is that I use an MVC framework, so www.domain.com/secret does not map to a physical folder ./secret.

01globalnet
08-03-2006, 07:33 PM
For a bot, as for every visitor, yourdomain.com/secret is just a URL requested from the web server (a virtual path); the physical path on disk is ignored. This is how a web server works: for example, if your web server is Apache and you type www.yourdomain.com/manual/, it can show the Apache manual (on installs where the manual is enabled) even though you do not have a physical folder named manual. So the Disallow rule applies to the URL path, not to a folder on disk.
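To see the URL-path matching in action, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, and the `is_allowed` helper are assumptions made up for illustration; real crawlers may differ in details, so treat this as a rough check rather than a statement of how any particular bot behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, using the /secret/ rule from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /secret/
"""


def is_allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    # parse() takes the file's lines; nothing is read from disk or the network.
    parser.parse(ROBOTS_TXT.splitlines())
    # Only the path part of the URL is compared against the Disallow prefixes.
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.domain.com/secret/",
        "http://www.domain.com/secret/report.html",
        "http://www.domain.com/secret",
        "http://www.domain.com/public/",
    ):
        print(url, is_allowed(url))
```

The parser never looks at the filesystem, so it makes no difference whether /secret/ is a real folder or a route handled by the framework. Note that the rule is a plain prefix match: with this parser, `/secret/` and anything below it is disallowed, but `/secret` without the trailing slash is not covered by `Disallow: /secret/`. If both forms should be blocked, `Disallow: /secret` is the broader rule.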

And do not forget to password protect this folder (assuming your MVC framework is not doing direct GET requests but includes only). Well-behaved bots will not index it, but robots.txt itself is publicly readable, so curious eyes may still try to see what's there.

Btw, which framework are you using?

grabmail
08-04-2006, 06:04 AM
Oh. CakePHP.

NyteOwl
08-04-2006, 03:01 PM
Note also that not all bots obey robots.txt. The best way to keep the directory /secret a secret is to make sure directory listings are turned off in .htaccess (on Apache, for example, with Options -Indexes) and not to link to it from any pages. Even that isn't 100% certain, but short of password protecting it, it is as close as you'll get.

sea otter
08-04-2006, 05:22 PM
Lots of interesting reading on robots here:

http://www.robotstxt.org/wc/robots.html