  1. #1

    Advice on robots.txt

    Please give advice about adding a robots.txt file to your website directory. What's its purpose, and do you need one?

    Wow, I'm short of breath
    We Bring Good Things to Life!
    www.pctank.co.uk
    - Web Design - Search Engine Optimization - Graphic Design -

  2. #2
    Join Date: Jan 2008
    Location: United Kingdom
    Posts: 414
    This gives info on robots.txt:
    http://www.robotstxt.org/robotstxt.html

  3. #3
    I think you need to add it, and it's really easy.

    Just make a file and name it "robot.txt"

    and put this in it

    User-agent: *
    Disallow:

    This will tell search bots to index all your pages.

  4. #4
    You never have to tell the search engines to index your pages, only which pages not to index.
    Steve
    Metal Monster Marketing : Internet Marketing

  5. #5
    Quote Originally Posted by angilina View Post
    Just make a file and name it "robot.txt"
    #1. It is "robots.txt" NOT "robot.txt".

    Quote Originally Posted by angilina View Post
    User-agent: *
    Disallow:

    This will tell search bots to index all your pages.
    Actually, if no parameter is given after the Disallow, as in your example, it will probably tell the bots NOT to index any of your pages, as the default is *, if I recall correctly.

    So if this user took your advice, he would have lost any pages he already had in the engines.

    Please read and learn, and don't give advice unless you know what you are talking about.
    William Cross
    Don Halbert *play site*
    william@seofox.com

  6. #6
    Quote Originally Posted by pctank View Post
    Please give advice about adding a robots.txt file to your website directory. What's its purpose, and do you need one?
    You do not need it unless you actually wish to block the spiders from certain files or directories on your website. I would suggest uploading an empty robots.txt file to your main HTML directory anyway, to cut down on the size of your Apache error log from all the "not found" errors.

    And the website that Sam gave you above is a good place to learn if you actually DO want to block things from the spiders.
    William Cross
    Don Halbert *play site*
    william@seofox.com
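    If you want to convince yourself that an empty robots.txt is harmless, Python's standard-library `urllib.robotparser` can show that an empty file imposes no restrictions. This is just an illustration (the `rp_empty` name and example path are made up for this sketch, not something from the thread):

    ```python
    from urllib.robotparser import RobotFileParser

    # An empty robots.txt contains no rules, so every bot may fetch everything.
    rp_empty = RobotFileParser()
    rp_empty.parse([])  # parse an empty file

    print(rp_empty.can_fetch("Googlebot", "/any/page.html"))  # True: nothing is blocked
    ```

    So uploading an empty file only silences the 404s in the error log; it changes nothing about what the spiders may crawl.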

  7. #7
    Thanks, that is a good idea.

    Thanks, guys!
    We Bring Good Things to Life!
    www.pctank.co.uk
    - Web Design - Search Engine Optimization - Graphic Design -

  8. #8
    Join Date: Sep 2004
    Location: Chennai, India
    Posts: 4,632
    Here is an article:

    http://www.seopapers.com/article/357

    Well, the author is none other than me. Self-promotion, LOL.

  9. #9
    Join Date
    Mar 2008
    Location
    Silicon Valley
    Posts
    5
    Hi Isak,

    It's best used to prevent search engines from indexing pages and directories that you don't want displayed to the public. It's not just Google and Yahoo! There are free and paid search engine services that anyone can use to search through your site for information. PicoSearch is a quick example.

    Another reason would be to keep your site from being archived by the Internet Archive (the Wayback Machine). Not too many people know this, but it can be bad for publicity if you have an unsuitable page that you cannot remove from the Internet.

  10. #10
    Join Date: Mar 2004
    Location: Pakistan
    Posts: 2,752
    Okay, the purpose behind the robots file is to guide the robots. If you leave it as it is, they will index every page unless you stop them entirely or stop them from indexing a specific link.

    Other than that, people use it to point to sitemaps as well. If you don't have a robots.txt, you don't need to worry about whether Google is going to index you or not. The algorithm is different now.
    I'm Zafar Ahmed.
    I provide
    SEO Services & eMarketing consultancy
    I'll be glad to hear from you

  11. #11
    Quote Originally Posted by nuclei View Post
    You do not need it unless you actually wish to block the spiders from certain files or directories on your web site. I would suggest uploading an empty robots.txt file to your main html directory anyways, to cut down on the filesize of your apache error log due to it not being found at all.

    And the web site that Sam gave you above is a good place to learn if you actually DO want to block things from the spiders.
    OK, shall I upload an empty robots.txt file to my directory just for the hell of it?
    We Bring Good Things to Life!
    www.pctank.co.uk
    - Web Design - Search Engine Optimization - Graphic Design -

  12. #12
    You can. It won't matter one way or the other. I do that with some sites just to stop the 404 error reports on that domain.
    Steve
    Metal Monster Marketing : Internet Marketing

  13. #13
    Quote Originally Posted by nuclei View Post
    #1. It is "robots.txt" NOT "robot.txt".



    Actually, if no parameter is given after the Disallow, as in your example, it will probably tell the bots NOT to index any of your pages, as the default is *, if I recall correctly.

    So if this user took your advice, he would have lost any pages he already had in the engines.

    Please read and learn, and don't give advice unless you know what you are talking about.
    Take a look at this page and try to learn:

    robotstxt.org/robotstxt.html

    There is a difference between the two directives.

    To exclude all robots from the entire server

    User-agent: *
    Disallow: /


    To allow all robots complete access

    User-agent: *
    Disallow:


    Maybe you forgot to wear your glasses.
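    The difference between those two forms can be checked with Python's standard-library `urllib.robotparser` (a small illustration; the variable names are made up for this sketch):

    ```python
    from urllib.robotparser import RobotFileParser

    # "Disallow: /" blocks the entire server for every bot.
    rp_block = RobotFileParser()
    rp_block.parse(["User-agent: *", "Disallow: /"])

    # "Disallow:" with no path blocks nothing at all.
    rp_allow = RobotFileParser()
    rp_allow.parse(["User-agent: *", "Disallow:"])

    print(rp_block.can_fetch("Googlebot", "/page.html"))  # False: everything is off-limits
    print(rp_allow.can_fetch("Googlebot", "/page.html"))  # True: complete access
    ```

    One character (the trailing slash) flips the meaning completely, which is why this trips people up so often.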

  14. #14
    Join Date: Sep 2004
    Location: Chennai, India
    Posts: 4,632
    Robots.txt is necessary if you are looking to block your seayorch engine. It's the way you guide the bots entering your website.

  15. #15
    Join Date: Mar 2004
    Location: Pakistan
    Posts: 2,752
    Quote Originally Posted by Biju View Post
    Robots.txt is necessary if you are looking to block your seayorch engine. It's the way you guide the bots entering your website.
    Biju - do you guys call "search" "seayorch" down there in India?
    I'm Zafar Ahmed.
    I provide
    SEO Services & eMarketing consultancy
    I'll be glad to hear from you

  16. #16
    Join Date: Sep 2004
    Location: Chennai, India
    Posts: 4,632
    Quote Originally Posted by Zafar Ahmed View Post
    Biju - do you guys call "search" "seayorch" down there in India?
    Zafar, typing mistake, friend. Anyway, whatever we call it, the same is practiced by our brothers in Pakistan. Hope the younger brother follows the big one.

  17. #17
    Join Date: Mar 2004
    Location: Pakistan
    Posts: 2,752
    Quote Originally Posted by Biju View Post
    Zafar, typing mistake, friend. Anyway, whatever we call it, the same is practiced by our brothers in Pakistan. Hope the younger brother follows the big one.
    I really don't know what you mean - have you been drinking lately?
    I'm Zafar Ahmed.
    I provide
    SEO Services & eMarketing consultancy
    I'll be glad to hear from you

  18. #18
    Join Date: Mar 2008
    Location: Slough, England
    Posts: 6
    See robotstxt.org.

    To exclude all robots from the entire server

    User-agent: *
    Disallow: /

    See robotstxt.org/robotstxt.html.

    To allow all robots complete access

    User-agent: *
    Disallow:

    That's it.

  19. #19
    Join Date: Sep 2004
    Location: Chennai, India
    Posts: 4,632
    Quote Originally Posted by Zafar Ahmed View Post
    I really don't know what you mean - have you been drinking lately?

    Well, friends, drinking is my profession, as I can't live without it. LOL. Read it again and you will understand.

  20. #20
    Quote Originally Posted by angilina View Post
    Take a look at this page and try to learn

    May be you forgot to wear your glasses
    Not at all. I simply have a longer memory of search engines, and everything concerning them, than the time you've been out of diapers. My recollection was indeed outdated, however, and I did say "as I recall" in my original post for that reason. That recollection was based on when Yahoo started handling wildcards in the robots.txt file.

    The fact remains that the ONLY good reason for even having a robots.txt file, UNLESS you are trying to block something, is to stop your error_log from growing.

    You never need to TELL the search engines to index every page they can find on your website. That is done by default by every engine. You only need to tell them what NOT to index.
    William Cross
    Don Halbert *play site*
    william@seofox.com

  21. #21
    Here is one more short description of robots.txt.

  22. #22
    Join Date: Nov 2002
    Location: USA
    Posts: 211

    Robots.txt files are tricky

    You really have to know how to use this file correctly; if you just state Disallow and don't state the robot's name, it won't work. I know, as I use tools to see what the robot will do, and I have seen them go right to a page they were told not to visit.

    Here is the wrong way to tell a robot not to index a directory:

    User-agent: *
    Disallow: /dir name/

    Here is the correct way to disallow a certain robot:

    User-agent: Googlebot
    Disallow: /dir name/

    For some reason, the first one always let Googlebot in, and the second did not, nor any of the other robots.

    Hope this helps.
    Cpwebhosting.net where the customer is first
    Great Affiliate Program Call 321-205-9003
    Earn on CPA up to $100.00 on Shared Hosting

  23. #23
    Join Date: Mar 2008
    Location: SEO cyberspace
    Posts: 423
    Quote Originally Posted by rcrrich View Post
    You really have to know how to use this file correctly; if you just state Disallow and don't state the robot's name, it won't work. I know, as I use tools to see what the robot will do, and I have seen them go right to a page they were told not to visit.

    Here is the wrong way to tell a robot not to index a directory:

    User-agent: *
    Disallow: /dir name/

    Here is the correct way to disallow a certain robot:

    User-agent: Googlebot
    Disallow: /dir name/

    For some reason, the first one always let Googlebot in, and the second did not, nor any of the other robots.

    Hope this helps.
    This is pretty strange, all right, since all bots, and especially Googlebot, are supposed to understand the wildcard. It's even stranger that specifying Googlebot would keep out all the other bots.

    Are you really sure of this information? My experience has been exactly the opposite.
    I plan to live forever - so far so good
    Expert SEO |Sash Windows London
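    For what it's worth, a standards-compliant parser such as Python's standard-library `urllib.robotparser` does apply a `User-agent: *` record to Googlebot, so the wildcard form should block the directory for every well-behaved bot. A quick illustration (the `/private/` path and variable name are made up for this sketch):

    ```python
    from urllib.robotparser import RobotFileParser

    # A wildcard record: per the robots.txt standard this applies to ALL bots,
    # including Googlebot, unless a more specific record overrides it.
    rp_wild = RobotFileParser()
    rp_wild.parse(["User-agent: *", "Disallow: /private/"])

    print(rp_wild.can_fetch("Googlebot", "/private/secret.html"))  # False: blocked by the wildcard
    print(rp_wild.can_fetch("Googlebot", "/public/page.html"))     # True: everything else is open
    ```

    If a crawler behaves differently from this, the problem is with that crawler (or the file's syntax), not with the wildcard rule itself.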

  24. #24
    Join Date: Nov 2002
    Location: USA
    Posts: 211

    Not what I was saying

    Quote Originally Posted by Melnel View Post
    This is pretty strange all right since all bots and especially googlebot are supposed to understand the wildcard. Its even stranger that specifying Googlebot would keep out all the other bots.

    Are you really sure of this information? My experience has been exactly the opposite.

    That's not what I was saying; that was just one example, with just one robot.
    I tried this out using the tools in Google itself, and when you don't name the robot as I showed, it was still allowed. So do as you wish; I'm just speaking from my experience. Try it in Google's tools and you will see. I thought as you did.
    Cpwebhosting.net where the customer is first
    Great Affiliate Program Call 321-205-9003
    Earn on CPA up to $100.00 on Shared Hosting
