Ok can someone please tell me who this crawl bot belongs to?

  #1  
03-17-2003, 12:04 PM
thomor25
Web Hosting Master
 
Join Date: Aug 2002
Posts: 727

Ok can someone please tell me who this crawl bot belongs to?


I think it might be Google but I'm not sure. It's the top one.

This is from awstats: first is the name of the bot, second is the number of times it accessed the site, third is the bandwidth used, and fourth is the date accessed.


Robot                                       Hits  Bandwidth  Date accessed
Unknown robot (identified by 'crawl')        255  1.44 MB    15 Mar 2003 - 04:36
Googlebot (Google)                           188  3.00 MB    16 Mar 2003 - 10:43
WISENutbot (Looksmart)                       139  2.22 MB    17 Mar 2003 - 06:47
Inktomi Slurp                                 43  574.99 KB  17 Mar 2003 - 05:39
Unknown robot (identified by 'robot')          2  45.00 KB   06 Mar 2003 - 03:09
Netcraft Web Server Survey                     1  0 Bytes    14 Mar 2003 - 23:22

__________________
www.betopdollarcom - Be Top Dollar - Are you willing to pay just $1 more to Be Top Dollar?

  #2  
03-17-2003, 12:09 PM
UH-Matt
Corporate Member
 
Join Date: Aug 2002
Location: London, UK
Posts: 9,027
Well, the Googlebot one is Google, duh!

The unknown one could be from many different sources. There are too many bots around these days.

__________________
Matt Wallis
United Communications Limited
High Performance Shared & Reseller | Managed VPS Cloud | Managed Dedicated
UK www.unitedhosting.co.uk | US www.unitedhosting.com | Since 1998.

  #3  
03-17-2003, 12:18 PM
thomor25
Web Hosting Master
 
Join Date: Aug 2002
Posts: 727
Quote:
Originally posted by UH-Matt
Well, the Googlebot one is Google, duh!

The unknown one could be from many different sources. There are too many bots around these days.
Well yeah, duh... but I heard a lot of people here talking about the Google bot doing a "deep crawl", and I wondered if the "crawl" meant that it was Google, because it had been accessing the site a lot.


  #4  
03-17-2003, 12:29 PM
eddy2099
Web Hosting Master
 
Join Date: May 2001
Posts: 8,070
It is probably some other web crawler. I've seen it in my awstats but do not know where it is from. I have not checked the raw logs; perhaps they would provide a little more info, such as the IP address, which you could then trace back.

  #5  
03-17-2003, 12:33 PM
sprintserve
Retired Moderator
 
Join Date: Jan 2003
Posts: 9,000
If it is from Google, it will be identified as Googlebot. So the unknown bot is... well, unknown. If you really want to, perhaps you can find out its IP and look it up at ARIN.

__________________
••• 100% Customer Satisfaction!!! •••
••• http://www.sprintserve.net •••
••• Offering: | Internap FCP Bandwidth! | Rebootless Kernel Updates! | Magento Optimized Hosting | •••
••• Services: | Managed Multiple Cores 64bit Servers | Server Management | •••

  #6  
03-17-2003, 03:24 PM
JayC
Web Hosting Master
 
Join Date: Aug 2000
Location: NYC
Posts: 6,627
Yep, anything from Google is identified, as is any crawler from a legitimate search engine. But there are all kinds of crawls done for any number of purposes. If you want to try to figure out who they are, the raw logs are the way to find the IP address and then see who it belongs to. Still, that may not tell you much.

Those two "unknown" listings, by the way, could each be more than one crawler lumped together, I'd guess (I don't use awstats). It looks like anything that identifies itself as 'crawl' goes in the first one and anything that identifies itself as 'robot' goes in the second.
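
For anyone who wants to try the raw-log route, here is a minimal Python sketch. It assumes the raw log is in Apache's Combined Log Format and that the 'crawl' label means the user-agent string contained that word; neither is confirmed in this thread. The function name `hits_by_keyword` and the sample lines (documentation-range IPs, a made-up `ExampleCrawler` agent) are hypothetical, not data from thomor25's site.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_by_keyword(lines, keyword="crawl"):
    """Count hits per (ip, user agent) where the agent contains `keyword`."""
    counts = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent")
        if keyword.lower() in agent.lower():
            counts[(match.group("ip"), agent)] += 1
    return dict(counts)


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [15/Mar/2003:04:36:01 -0500] "GET / HTTP/1.0" 200 5120 "-" "ExampleCrawler/0.1"',
    '192.0.2.10 - - [15/Mar/2003:04:36:05 -0500] "GET /about.html HTTP/1.0" 200 2048 "-" "ExampleCrawler/0.1"',
    '198.51.100.7 - - [16/Mar/2003:10:43:12 -0500] "GET / HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '203.0.113.5 - - [17/Mar/2003:05:39:40 -0500] "GET / HTTP/1.0" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    for (ip, agent), hits in hits_by_keyword(SAMPLE_LOG, "crawl").items():
        print(ip, hits, agent)
```

If several different IPs or agent strings come back for one keyword, that would support the guess that the "unknown" row lumps more than one crawler together.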

__________________
Specializing in SEO and PPC management.

  #7  
03-17-2003, 03:27 PM
thomor25
Web Hosting Master
 
Join Date: Aug 2002
Posts: 727
Quote:
Originally posted by JayC
Yep, anything from Google is identified, as is any crawler from a legitimate search engine. But there are all kinds of crawls done for any number of purposes. If you want to try to figure out who they are, the raw logs are the way to find the IP address and then see who it belongs to. Still, that may not tell you much.

Those two "unknown" listings, by the way, could each be more than one crawler lumped together, I'd guess (I don't use awstats). It looks like anything that identifies itself as 'crawl' goes in the first one and anything that identifies itself as 'robot' goes in the second.
Well, the only IPs that accessed my site at that time were from www.ripe.net

  #8  
03-17-2003, 03:29 PM
sprintserve
Retired Moderator
 
Join Date: Jan 2003
Posts: 9,000
That means the bots are from Europe (thus ARIN reports the IPs as belonging to RIPE). Go to RIPE and look up the same IPs again.

  #9  
03-17-2003, 03:43 PM
thomor25
Web Hosting Master
 
Join Date: Aug 2002
Posts: 727
Did what you said and it gave me this:

inetnum: 80.8.54.0 - 80.8.72.255
netname: FR-FT-WIC
descr: France Telecom Wanadoo Interactive Cable
descr: bas-1.sqy.net
country: FR
admin-c: WICT1-RIPE
tech-c: WICT1-RIPE
status: ASSIGNED PA
remarks: for hacking, spamming or security problems send ALSO mail to
remarks: abuse@cablewanadoo.com
remarks: for ANY problem send mail to gestionip.ft@francetelecom.com
notify: gestionip.ft@francetelecom.com
mnt-by: FT-BRX
changed: gestionip.ft@francetelecom.com 20011002
source: RIPE

route: 80.8.0.0/16
descr: France Telecom
descr: Wanadoo Interactive Cable
remarks: -------------------------------------------
remarks: For Hacking, Spamming or Security problems
remarks: send mail to abuse@cablewanadoo.com ONLY
remarks: -------------------------------------------
origin: AS3215
mnt-by: RAIN-TRANSPAC
mnt-by: FT-BRX
changed: karim@rain.fr 20010612
changed: karim@rain.fr 20020130
changed: gestionip.ft@francetelecom.com 20020909
source: RIPE

role: Wanadoo Interactive Cable Technical Role
address: France Telecom Wanadoo Interactive Cable
address: 40, rue Gabriel Crié
address: 92240 Malakoff
address: FR
phone: +33 1 58 88 54 16
e-mail: abuse@cablewanadoo.com
admin-c: MM2888-RIPE
tech-c: ML16648-RIPE
nic-hdl: WICT1-RIPE
mnt-by: FT-BRX
changed: gestionip.ft@francetelecom.com 20010517
changed: gestionip.ft@francetelecom.com 20020531
source: RIPE
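
If you have a batch of IPs from the logs and want to see which ones fall inside a range like the one above, a rough Python sketch follows. It only parses the "key: value" text that the lookup returned and does no network query. The helper names `parse_whois_object` and `in_inetnum` are made up for illustration, and the test addresses 80.8.60.1 and 80.8.73.0 are hypothetical, since the actual crawler IP was never posted in this thread.

```python
import ipaddress


def parse_whois_object(text):
    """Parse one RIPE-style 'key: value' object into a dict of value lists."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        # A key such as 'descr' can repeat, so every value is kept in a list.
        record.setdefault(key.strip(), []).append(value.strip())
    return record


def in_inetnum(ip, inetnum):
    """True if `ip` lies inside an IPv4 range such as '80.8.54.0 - 80.8.72.255'."""
    start_text, end_text = (part.strip() for part in inetnum.split("-"))
    start = ipaddress.ip_address(start_text)
    end = ipaddress.ip_address(end_text)
    return start <= ipaddress.ip_address(ip) <= end


# First object from the lookup above, shortened to a few fields.
SAMPLE_OBJECT = """inetnum: 80.8.54.0 - 80.8.72.255
netname: FR-FT-WIC
descr: France Telecom Wanadoo Interactive Cable
descr: bas-1.sqy.net
country: FR
source: RIPE"""

if __name__ == "__main__":
    record = parse_whois_object(SAMPLE_OBJECT)
    inetnum = record["inetnum"][0]
    # Hypothetical addresses; substitute the IPs from your own raw logs.
    for ip in ("80.8.60.1", "80.8.73.0"):
        print(ip, in_inetnum(ip, inetnum), record["netname"][0])
```

Note that the inetnum range here does not fall on a single CIDR boundary, which is why the sketch compares against the start and end addresses instead of building one network object.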

  #10  
03-17-2003, 03:48 PM
sprintserve
Retired Moderator
 
Join Date: Jan 2003
Posts: 9,000
Well, Wanadoo is an ISP in France, so whoever is running that bot is running it from their home cable connection. Unless the bot did something illegal, that's as far as you can go. We can only speculate about why they are crawling your site.

