  1. #1

    Googlebot is overloading my site. What should I do?

    Over the last 3-4 months I added roughly 20-30 MB of content to my database (PHP/MySQL site), and that's when my problems started. (The database is now about 60 MB.)
    Every time Google visited my site, Apache became overloaded and as a result the site went down. At that time I was on VPS hosting (256 MB RAM).

    At first I didn't understand the problem; I didn't see any connection between Google's visits and the server overloading.

    This went on for three months, then I decided to move to another host and upgrade my hosting account to 512 MB (that was 30-40 days ago).
    By the way, the server move didn't go smoothly, and as a result the site was down for 2-3 days.

    As a result of this:
    1. I lost Google's traffic: where Google used to have 15-25k of my pages indexed, there are now only 500-700.
    2. None of my keywords rank anymore.
    3. Where 3-5 months ago Google transferred 2-3 GB during its site visits, it's now at most 300-340 MB.

    During this last month Google visited my site 5-7 times, and 4 of those visits overloaded the server.

    I don't know what to do.

    1. Write to Google and tell them their visits are overloading my site?
    Now, when they don't even do deep crawls?
    The "overloading" form states that sending it would lead to Google coming to my site less often.
    That's not what I want at all!

    I think that after this they would visit even less, which is not what I'm after.

    2. Upgrade my server, or move to a dedicated server with at least 1 GB of RAM?
    Hmm... during these 40 days my business (the site's income) has been at its worst, so it's a little scary to pay for a dedicated server without being sure we can earn enough.

    The problem is that when my troubles started, I first thought it might be a Google ban or penalty (for duplicate content).
    Even now I'm not sure that server overloading is the only problem.

    Please help me:
    what do you suggest as the best solution in this case?

    thank you
    Last edited by 3-rx; 05-19-2005 at 03:36 PM.
    Your Health Encyclopedia
    Medical and health consumer information resources containing comprehensive and unbiased information in patient-friendly language

  2. #2
    Join Date
    Mar 2005
    Location
    England
    Posts
    1,201
    Can't you just put a line in the meta tags that tells bots to stay away?
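    Presumably the line meant here is the standard robots meta tag, which asks compliant crawlers not to index a page or follow its links:

    ```html
    <meta name="robots" content="noindex, nofollow">
    ```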
    My Blog - www.Bakie.net
    My Entertainment site - www.SpudMud.com

  3. #3
    Why? To restrict robots from indexing my site?

  4. #4
    Join Date
    Feb 2003
    Location
    Connecticut
    Posts
    5,441
    I would look into caching. If you're not a programmer, find a freelancer and ask about it.

    Basically, it takes dynamic pages and makes static copies of them, so Google would only get HTML pages, not pages that have to hit the database.
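    A minimal sketch of that idea in PHP (the cache path, TTL, and helper name here are all hypothetical, not a specific product): serve a saved static copy while it is fresh, otherwise run the normal dynamic code and save the result.

    ```php
    <?php
    // Sketch of file-based page caching (paths, TTL, and function name are assumptions).
    function cached_page(string $cacheFile, int $maxAge, callable $generate): string
    {
        // A fresh cached copy exists: serve it without touching the database.
        if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
            return file_get_contents($cacheFile);
        }
        // Otherwise run the normal dynamic code and save a static copy.
        $html = $generate();
        file_put_contents($cacheFile, $html);
        return $html;
    }

    // Hypothetical usage at the top of a dynamic page:
    $cacheFile = sys_get_temp_dir() . '/' . md5($_SERVER['REQUEST_URI'] ?? '/') . '.html';
    echo cached_page($cacheFile, 3600, function () {
        // ... the normal PHP/MySQL page generation would go here ...
        return "<html><body>generated page</body></html>";
    });
    ```

    On the second and later requests within the hour, Googlebot (and everyone else) gets the static file and the database is never queried.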

  5. #5
    http://www.google.com/webmasters/faq.html#toofast

    It says to contact them. I could've sworn I recently saw a robots.txt that told Googlebot to limit hit speed, but I can't find it now and it doesn't seem to be a robots exclusion standard.

  6. #6
    I could've sworn I recently saw a robots.txt that told Googlebot to limit hit speed, but I can't find it now and it doesn't seem to be a robots exclusion standard.
    Do you mean this?

    User-agent: Googlebot
    Crawl-delay: 10

    Unfortunately, I don't think Googlebot honors Crawl-delay.
    It's useful for MSNbot, though.

    http://www.google.com/webmasters/faq.html#toofast
    As I said before, I need deep crawls, so I don't think contacting Google and asking them to crawl less is the answer. That's not what I want at all!

    I would look into Caching.
    Hmm... that's very interesting. I'm not a programmer, but I've heard about this.
    Can you suggest a programmer who can do it, or maybe give more info about it?

    Thanks

  7. #7
    I added this line to my robots.txt and checked it via

    www.searchengineworld.com/cgi-bin/robotcheck.cgi and got
    ERROR Invalid fieldname: Crawl-delay: 10

    Hmm :-(

  8. #8
    Originally posted by 3-rx
    do u mean this?

    User-agent: Googlebot
    Crawl-delay: 10

    unfortunately , I don't think googlebot honors a Crawl-delay.
    Useful for MSNbot, though
    Yeah, I think that was it. And I think it was for MSN, not Google. And it's not a standard (MS not following standards? Go figure), so the robots.txt validator would flag it.

    I'm confused about your response to my link. The link is directly to a question about crawling too fast, not too deep. They have a link to a contact page and say they'll work to figure out the problem.

    Good luck...

  9. #9
    Join Date
    Jun 2004
    Location
    St.Petrsburg, Russia
    Posts
    11
    Hello.

    The first thing I would look at is the server logs. You could try to extract information about Google's visits from them; it might give you some idea of why you have the problem.
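    For instance, a small PHP script could count Googlebot requests per day in an Apache access log. A sketch (the sample log lines and the common log format are assumptions; in practice the lines would come from `file()` on the real log path):

    ```php
    <?php
    // Count Googlebot requests per day from Apache access-log lines.
    // Sample lines stand in for file('/path/to/access.log').
    $lines = [
        '66.249.66.1 - - [19/May/2005:14:03:11 +0000] "GET /page1 HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
        '66.249.66.1 - - [19/May/2005:14:03:12 +0000] "GET /page2 HTTP/1.1" 200 4096 "-" "Googlebot/2.1"',
        '10.0.0.5 - - [19/May/2005:14:05:00 +0000] "GET /page1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    ];

    $perDay = [];
    foreach ($lines as $line) {
        if (stripos($line, 'Googlebot') === false) {
            continue;                                  // ignore other user agents
        }
        // The timestamp looks like: [19/May/2005:14:03:11 +0000]
        if (preg_match('#\[(\d{2}/\w{3}/\d{4})#', $line, $m)) {
            $perDay[$m[1]] = ($perDay[$m[1]] ?? 0) + 1;
        }
    }

    foreach ($perDay as $day => $count) {
        echo "$day: $count Googlebot hits\n";
    }
    ```

    A sudden spike of hits on the days the server went down would point strongly at the crawler as the cause.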

    As to caching: usually it does not require any programming. Most probably you have this option in your host's control panel. Of course, caching could be disabled in your script by sending the following headers to the browser:

    PHP Code:
    header("Cache-Control: no-cache");
    header("Pragma: no-cache");
    If you do not have any lines like those in your code, you can most probably control caching through your host's control panel. Try turning the cache on.

    As to vanishing from the Google index, it could be due to your site being unavailable to Googlebot for the 2-3 days when you relocated to the other host. If this is the problem, your site should reappear in the Google index in a month or so. The reason could be different, though, so the situation requires some further detailed exploration.

    --
    Best Regards,
    Sergey Korolev
    www.SKDevelopment.com

  10. #10
    Join Date
    Jul 2001
    Location
    Canada
    Posts
    1,284
    There are a lot of articles about search bots/spiders getting caught in loops and the like on dynamic web sites.

    A search on Google will yield lots of info.
    "Obsolescence is just a lack of imagination."

  11. #11
    Google has the WebAccelerator now, which is causing problems for our webapps. 37signals has a great article in their blog archive about it.

  12. #12
    Originally posted by Sergey Korolev
    As to the caching: usually caching does not require any programming. Most probably you have this option in your hoster control panel. Of course caching could be disabled in your script by sending the following headers to the browser:
    PHP:

    header ("Cache-control: no-cache");
    header ("Pragma: no-cache");

    If you do not have any lines like those in the code, most probably you could control caching through your hoster control panel. Try to turn the cache on.
    Hello Sergey,

    Yes, I had those lines in my files (as meta tags), but I deleted them 10 days ago.
    I'm on a VPS account, so I have full control over my server, but unfortunately I don't know how to "turn the cache on" through the control panel (cPanel/WHM). :-(

    By the way, I checked your SEO tool: nice tool.

    To BigMoneyJim:

    I'm confused about your response to my link. The link is directly to a question about crawling too fast, not too deep. They have a link to a contact page and say they'll work to figure out the problem.
    If you're sure that the link is about crawling too fast and won't block deep crawls, I'll think about it. Thank you.

    To tball:
    I checked the 37signals site; it's a nice site with great PR, but I didn't find anything relevant to my problem.

  13. #13
    Regarding my site cache

    As to vanishing from the Google index, it could be due to your site unavailability for the Google Bot for 2-3 days. That is when you relocated to the other hoster. If this is the problem, your site will reappear in the Google index in a month or so. Though the reason could be different. So the situation requires some further detailed exploration of course.
    its' my site cache results [yourcache.com]

    very interesting information. view results for 20 may and for 17 mAY


  15. #15
    Join Date
    Jun 2004
    Location
    St.Petrsburg, Russia
    Posts
    11
    I have a special Cache Manager in my host's control panel (not cPanel). It allows me to turn the cache on and off and to change the cache expiry time for different folders.

    But all it does is add a .htaccess file to the directory with the following lines:

    ExpiresActive On
    ExpiresDefault "access plus 1 hour"
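    Those are directives from Apache's mod_expires module. On a server where that module might not be loaded, the same rules can be wrapped in a guard so Apache does not fail on the unknown directives (a sketch; assumes Apache with mod_expires available):

    ```apache
    # .htaccess sketch: apply expiry headers only when mod_expires is loaded
    <IfModule mod_expires.c>
        ExpiresActive On
        ExpiresDefault "access plus 1 hour"
    </IfModule>
    ```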

    I think it is better to ask your host's technical support whether they have a server cache and allow users to control it.

