  1. #1
    Join Date
    May 2002
    Location
    Texas
    Posts
    137

    Is there a way to block crawlers and bots?

    Hi,
    I'm starting to get more traffic on one of my sites than I care to, and it's bots harvesting names.

    How can I block them?

    The host I'm with uses cPanel 4.something or other.

    Thanks!
    John
    "If you come to a fork in the road...take it"

    Yogi Berra

  2. #2
    Join Date
    Jun 2000
    Location
    Washington, USA
    Posts
    5,991
    What kind of bots? The e-mail harvesting kind?

  3. #3
    Join Date
    May 2002
    Location
    Texas
    Posts
    137
    Yes, email harvesting. I've removed every possible link to an email address on all the domains under my care, and I use a PHP email form instead.

    But even so, I'm seeing at least two and sometimes as many as five crawlers a day in my log files.

    I'm hunting for ideas.

    Thanks!

  4. #4
    Join Date
    Nov 2001
    Location
    Vancouver
    Posts
    2,416
    Use mod_rewrite

    http://www.engelschall.com/pw/apache...teguide/#ToC37

    If it's only a couple of bots and they send a consistent HTTP_USER_AGENT string, this will work fine. Redirect them to a spam site for fun.

    Be aware that every condition you add is one more test run on every hit (every request, not just every page).
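
    For anyone who wants to see the matching logic outside of Apache, here is a minimal Python sketch of the same idea: compare the User-Agent header against a short list of substrings and refuse the request on a match. This is not mod_rewrite itself. The names BLOCKED_AGENTS, is_blocked and block_harvesters are made up for illustration, and the agent strings are only examples, so check your own logs for what is actually hitting you.

```python
# Hypothetical substrings to look for; replace with what your own logs show.
BLOCKED_AGENTS = ("emailsiphon", "emailcollector", "extractorpro")


def is_blocked(user_agent):
    """Return True if the User-Agent contains any blocked substring."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in BLOCKED_AGENTS)


def block_harvesters(app):
    """Wrap a WSGI app so blocked agents get a 403 instead of the page."""
    def wrapper(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return wrapper
```

    As with the rewrite rules, the check runs on every request, so a short list is cheaper than a long one. A bot that fakes a browser User-Agent will get straight past it.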
    “Even those who arrange and design shrubberies are under
    considerable economic stress at this period in history.”

  5. #5
    Join Date
    Nov 2001
    Location
    Ann Arbor, MI
    Posts
    2,978
    -Mark Adams
    www.bitserve.com - Secure Michigan web hosting for your business.
    Only host still offering a full money back uptime guarantee and prorated refunds.
    Offering advanced server management and security incident response!

  6. #6
    Join Date
    Aug 2002
    Location
    Baltimore, Maryland
    Posts
    580
    If you feel like it, just write a quick JavaScript snippet that puts the different parts of your email address together after someone clicks on the link. There was some software that wrote the script for you. I forgot what it's called, though.
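
    A rough sketch of what such a generator might do, written in Python: it splits the address into pieces and emits a link whose onclick handler concatenates them, so the complete address never appears in the page source. The function name js_mailto_link and the generated markup are assumptions for illustration, not the output of the software mentioned above.

```python
import html
import json


def js_mailto_link(address, label="Email me"):
    """Build an <a> tag whose onclick JavaScript reassembles the address.

    The address is cut at '@' and at the last '.', so the complete string
    never appears in the generated HTML.
    """
    user, _, domain = address.partition("@")
    host, _, tld = domain.rpartition(".")
    # json.dumps produces properly quoted JavaScript string literals.
    parts = " + ".join(json.dumps(p) for p in (user, "@", host, ".", tld))
    script = "window.location = 'mailto:' + " + parts + "; return false;"
    # Escape quotes so the script is safe inside a double-quoted attribute.
    return '<a href="#" onclick="{}">{}</a>'.format(
        html.escape(script, quote=True), html.escape(label)
    )


if __name__ == "__main__":
    print(js_mailto_link("john@example.com"))
```

    This only helps against harvesters that scan the raw HTML for address patterns. Anything that actually runs the JavaScript would still end up with the address.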

  7. #7
    Join Date
    Aug 2002
    Location
    Phoenix, AZ
    Posts
    3
    Well, try a robots.txt (Google for it). I don't know how well those bots will obey it, though.

    Another option is to obfuscate your addresses, or to poison the bots' lists with a bunch of phony addresses.
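
    To make the robots.txt idea concrete, here is a small Python sketch that uses the standard library's urllib.robotparser to show how a rule-abiding crawler would read such a file. The bot name EmailSiphon and the /private/ path are only examples, and a harvester that ignores robots.txt is not affected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: shut one named bot out entirely,
# and keep everyone else out of /private/ only.
ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, path):
    """Ask the parser whether a rule-abiding bot may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    checks = [
        ("EmailSiphon", "/index.html"),
        ("Googlebot", "/index.html"),
        ("Googlebot", "/private/list.html"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
```

    The file itself goes in the site root as /robots.txt. The Python part is only there to check that the rules say what you meant.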

  8. #8
    Join Date
    Jul 2001
    Location
    Troy, Missouri USA
    Posts
    1,299

  9. #9
    Join Date
    Aug 2002
    Location
    Baltimore, Maryland
    Posts
    580
    Google has something about robots.txt. You can put rules in it to tell Google's crawler not to spider your site.
