  1. #1

    What To Do When Data Center has Issues?

    You can have RAID, redundant PSUs, a stash of RAM on standby ... but what do you do when the data center has network problems?

    Tonight my site went offline for 2 hours (and counting...). A network fault has been identified and a fix is in place. There have been small outages in the past, but nothing as long as this.

    So what can be done as a fallback? I don't mean for you high rollers with replicated servers all over the place. What about us small-time sites?

    Another server in a different DC is one solution. But how do you switch over DNS settings so that visitors connect to the backup server instantly?

  2. #2
    Join Date
    Nov 2005
    Posts
    3,944
    The easiest solution is to run an external DNS server with a low TTL and update the records upon failure. However, this will not work 100% because current/recent visitors may have DNS information cached.
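    If you script the record update, the switch itself becomes a one-liner when things go wrong. Here's a minimal sketch, assuming a DNS provider that exposes an HTTP API; the endpoint, token, and payload below are made-up placeholders, not any real provider's interface:

    ```python
    # Sketch only: repoint an A record at the backup server via a DNS provider's API.
    # The URL, auth header, and JSON body are hypothetical placeholders.
    import requests

    PRIMARY_IP = "111.111.111.111"   # normal server (placeholder)
    BACKUP_IP = "222.222.222.222"    # failover server (placeholder)

    def point_record_at(ip: str) -> None:
        """Update the A record for example.com to the given IP (hypothetical API)."""
        resp = requests.put(
            "https://api.dns-provider.example/zones/example.com/records/A",  # hypothetical endpoint
            headers={"Authorization": "Bearer YOUR_API_TOKEN"},               # hypothetical auth
            json={"content": ip, "ttl": 60},  # keep the TTL low so the change is picked up quickly
            timeout=10,
        )
        resp.raise_for_status()

    if __name__ == "__main__":
        # Run this when the primary data center is down.
        point_record_at(BACKUP_IP)
    ```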


  3. #3
    You can try a distributed DNS service and CDNs.

  4. #4
    OK, will take a look at those.

    ---

    The site was down for nearly 4 hours in total. In 14 years of running a site (dedicated and colocation) I have never experienced downtime like this. It's unacceptable.

  5. #5
    The best solution is, first, to outsource the DNS and, second, to have a second server in a different DC for failover.

    Some DNS providers offer TTLs as low as 1 or 5 minutes, so propagating the "new" records will be quite fast.

    The second option is, as @webhostal suggested, using a CDN to serve the static content of your websites (it won't work with dynamic content), something similar to CF's "Always Online".
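    If you want to verify what TTL resolvers are actually being handed for your records right now, a few lines of Python will show it. Just a sketch, assuming the dnspython package is installed (dns.resolver.resolve is the dnspython 2.x name; older releases call it dns.resolver.query):

    ```python
    # Print the A records for a name together with the TTL the resolver reports.
    # Requires dnspython (pip install dnspython).
    import dns.resolver

    answer = dns.resolver.resolve("example.com", "A")  # dnspython 2.x; use .query on 1.x
    for rdata in answer:
        print(f"{rdata.address}  (TTL {answer.rrset.ttl}s)")
    ```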

  6. #6
    Join Date
    Feb 2011
    Posts
    607
    Use datacenters with redundant routers and redundant networks and have redundant paths from your network to those datacenter routers via redundant NICs and switches. Or run your own network, but that would cost much more.

  7. #7
    I thought the datacentre I'm with was reliable; it did have all that resilience. It seems that last night this happened:

    "The underlying root cause of tonight's issues has been identified as a fibre break. This caused excessive flapping on the E1 transit ring which resulted in several of the ring switches rebooting. Unfortunately one of the switches did not recover - this then led to severe network instability which proved difficult to pinpoint."
    ---

    Why isn't the TTL set to 1 minute for all sites already? Is it because of excessive lookups?

  8. #8
    Join Date
    Feb 2011
    Posts
    607
    Just because there is some redundancy does not mean that it is good enough. A single fibre break should not lead to any issues in a properly designed system, nor should it in a properly tested one.

  9. #9
    Join Date
    Apr 2009
    Location
    Toronto
    Posts
    251
    I find a lot of customers will have a cloud instance ready for failover: a low TTL for manual switchover, or intelligent DNS load balancing (it detects when server 1 goes dead and directs traffic to server 2 / the cloud instance). This can be done fairly cheaply depending on what the website is doing.

    It's always good to have a failover plan in place. We've found it's not a question of if your data center will go down, but when, and of being prepared for that. It might be perfect for 1 year, 5 years or more, but sooner or later something will occur even in the best setups.
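    For the "detects when server 1 goes dead" part, even a home-grown monitor gets you most of the way there if your DNS provider has no built-in health checks. A rough sketch, assuming a /health URL on the primary and a record-update hook of your own (both invented here):

    ```python
    # Poll the primary server and trigger a DNS switch after several consecutive failures.
    # CHECK_URL and fail_over_to_backup() are placeholders to adapt to your own setup.
    import time
    import urllib.request

    CHECK_URL = "http://111.111.111.111/health"  # assumed health endpoint on the primary
    FAILURE_THRESHOLD = 3                        # consecutive failures before switching
    CHECK_INTERVAL = 30                          # seconds between checks

    def primary_is_up() -> bool:
        try:
            with urllib.request.urlopen(CHECK_URL, timeout=5) as resp:
                return resp.status == 200
        except Exception:
            return False

    def fail_over_to_backup() -> None:
        # Replace with your DNS provider's record-update call.
        print("Primary looks dead - repointing DNS at the backup server")

    failures = 0
    while True:
        failures = 0 if primary_is_up() else failures + 1
        if failures >= FAILURE_THRESHOLD:
            fail_over_to_backup()
            break
        time.sleep(CHECK_INTERVAL)
    ```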

  10. #10
    Join Date
    Aug 2007
    Location
    Datacenter
    Posts
    4,414
    You can also have a talk with the datacenter and ask why on earth they don't have a redundant setup. That doesn't cost you a thing.

    Unless they are the victim of cyber crime, which is of course another story.

  11. #11
    Quote Originally Posted by Larry David View Post
    I thought the datacentre I'm with was reliable; it did have all that resilience. It seems that last night this happened:

    "The underlying root cause of tonight's issues has been identified as a fibre break. This caused excessive flapping on the E1 transit ring which resulted in several of the ring switches rebooting. Unfortunately one of the switches did not recover - this then led to severe network instability which proved difficult to pinpoint."
    ---

    Why isn't the TTL set to 1 minute for all sites already? Is it because of excessive lookups?

    I know which datacentre that is; we've got a few servers there too and had quite a bit of downtime the other day. It was unfortunate, and I was surprised a single fibre break would cause that much chaos. They're generally very reliable though (in my experience).

  12. #12
    Join Date
    Oct 2009
    Posts
    590
    You cannot rely on anything any provider says about their data center unless you see it for yourself and you know what questions to ask.

    No matter how redundant they say they are, very few truly are. I am saying this from experience with a lot of them. The economics of being truly redundant make it almost impossible to compete, so they almost never are; yes, even some of the supposedly best of the best who charge top dollar. Also, some truly redundant data centers have still gone down for other reasons, like when Hurricane Sandy flooded Manhattan.

    As someone else said, always assume that it's not a matter of if but when a data center goes down. You should do everything you can to make things data center redundant. DNS load-balancing/failover is probably the easiest and most economical.

  13. #13
    econdc: do you think it has to do with the recent takeover of PH by P?

  14. #14
    Join Date
    Oct 2012
    Location
    Arlington, VA
    Posts
    332
    What about a failover cross-connect to another provider in the same facility?

  15. #15
    Join Date
    Feb 2005
    Location
    localhost
    Posts
    5,473
    A provider with multiple locations and a failover option might be a good idea. A cloud service, perhaps?
    Respectfully,
    Mr. Terrence

  16. #16
    Join Date
    Jan 2006
    Posts
    72
    Many large ISPs enforce a minimum TTL (measured in hours), so setting yours to 1 minute may be futile depending on your customer base.

    Another possible solution would be load balancing, though at a higher price depending on the amount of bandwidth you're using. Rackspace has load-balancing services (you don't have to be hosting with them), but if you're pushing many TBs of data, that's probably going to be more than you're looking to spend. Some smaller hosting companies might provide this service if you are hosting in two of their sites; if you're purchasing bandwidth from them, you're not paying for it twice as you would be with Rackspace. The company you're with now may be able to work with you on such a solution, or possibly recommend somebody who has done this for their customers already.

  17. #17
    Join Date
    Nov 2012
    Location
    Los Angeles, CA
    Posts
    498
    Look for another data center that has the security and reliability to handle all your needs.

  18. #18
    Join Date
    Oct 2012
    Location
    Arlington, VA
    Posts
    332
    If it were that simple, we'd all use that one "data center that has the security and reliability". The bottom line is that all data centers have issues now and then - the only variable is how the service team handles them. Do they ignore tickets and fight back on WHT, like some providers we've seen on here? Or do they suck it up and accept that it was their own fault and deal with it accordingly?

  19. #19
    I am putting a new server in a different DC with the old server staying where it is. I now just need to work out a good strategy to use if the new DC is unavailable for some period of time.

    An hour or two of downtime would be just about acceptable (as long as it is a rare occasion). Anything longer than that and I would like the site users to be pointed to the old backup server.

    Updating the DNS record is fine but there is the problem of waiting for it to ripple through to users.

    I am not bothered about new visitors or search engine bots not picking up that the backup server is in use, but I do need to ensure that my regular customers can access their data. Thus I would need them to access the backup server ASAP.

    One way to do that, I guess, is for them to modify their hosts file. If they add an entry for the backup server:

    222.222.222.222 example.com

    then issue

    ipconfig /flushdns

    and restart their browser they should instantly connect to the backup server, correct?
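    As a quick sanity check on their end (or mine), a couple of lines of standard-library Python would confirm which IP the machine is actually resolving; just a sketch:

    ```python
    # Check which IP example.com resolves to on this machine (a hosts-file entry,
    # once the cache is flushed, normally takes priority over DNS). Standard library only.
    import socket

    BACKUP_IP = "222.222.222.222"  # the backup server from the hosts entry above

    resolved = socket.gethostbyname("example.com")
    print(f"example.com resolves to {resolved}")
    print("hosts override active" if resolved == BACKUP_IP else "still using the old record")
    ```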

  20. #20
    Join Date
    Jan 2010
    Posts
    308
    Quote Originally Posted by Larry David View Post
    222.222.222.222 example.com (the backup server)

    then issue

    ipconfig /flushdns

    and restart their browser they should instantly connect to the backup server, correct?
    /flushdns doesn't control the cache of their upstream DNS resolver; it just flushes the host's own DNS resolver cache. Setting a low-ish TTL usually works well enough. From what we've seen after changing DNS records, the number of people still hitting the old IP immediately after the TTL expires is very low, usually in the 1-3% range.

    Keeping a low TTL works a lot better than the fearmongers would suggest. Are there some ISPs that pin cached TTLs to their own liking? Sure. Do they have a ton of eyeballs? No.
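    If you want to see the difference between those caches for yourself, compare what your configured resolver returns with the answer straight from the zone's authoritative nameserver. A small sketch, assuming dnspython is installed (the authoritative server address below is a placeholder):

    ```python
    # Compare the answer from the configured (possibly caching) resolver with the
    # answer straight from an authoritative nameserver. Requires dnspython.
    import dns.resolver

    NAME = "example.com"

    # What the system's configured resolver currently returns (may be cached).
    cached = dns.resolver.resolve(NAME, "A")

    # Ask an authoritative nameserver for the zone directly (placeholder IP below).
    direct = dns.resolver.Resolver(configure=False)
    direct.nameservers = ["198.51.100.53"]  # placeholder: one of your zone's NS addresses
    fresh = direct.resolve(NAME, "A")

    print("resolver:     ", [r.address for r in cached], "remaining TTL", cached.rrset.ttl)
    print("authoritative:", [r.address for r in fresh])
    ```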

  21. #21
    Join Date
    Aug 2010
    Posts
    1,976
    What datacenter is this?

  22. #22
    Join Date
    Jan 2006
    Posts
    35
    Amazon's Route 53 DNS service also offers DNS failover with low TTLs. I haven't used it myself, but from reading the documentation it may help you with your setup, Larry, and it seems quite low cost.
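    From the documentation it boils down to a health check on the primary plus paired PRIMARY/SECONDARY record sets. Roughly like this with boto3, untested on my side and with placeholder zone ID and IPs:

    ```python
    # Sketch of Route 53 DNS failover: a health check on the primary plus
    # PRIMARY/SECONDARY failover record sets. Assumes boto3 with AWS credentials
    # configured; the hosted zone ID and IPs are placeholders.
    import boto3

    route53 = boto3.client("route53")
    ZONE_ID = "ZXXXXXXXXXXXXX"  # placeholder hosted zone ID
    PRIMARY_IP, BACKUP_IP = "111.111.111.111", "222.222.222.222"

    # Health check that Route 53 runs against the primary server.
    check = route53.create_health_check(
        CallerReference="example-com-failover-1",
        HealthCheckConfig={
            "IPAddress": PRIMARY_IP,
            "Port": 80,
            "Type": "HTTP",
            "ResourcePath": "/",
            "RequestInterval": 30,
            "FailureThreshold": 3,
        },
    )

    def failover_record(ip, role, health_check_id=None):
        """Build an UPSERT change for a failover A record ("PRIMARY" or "SECONDARY")."""
        record = {
            "Name": "example.com.",
            "Type": "A",
            "SetIdentifier": f"example-com-{role.lower()}",
            "Failover": role,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        }
        if health_check_id:
            record["HealthCheckId"] = health_check_id
        return {"Action": "UPSERT", "ResourceRecordSet": record}

    route53.change_resource_record_sets(
        HostedZoneId=ZONE_ID,
        ChangeBatch={"Changes": [
            failover_record(PRIMARY_IP, "PRIMARY", check["HealthCheck"]["Id"]),
            failover_record(BACKUP_IP, "SECONDARY"),
        ]},
    )
    ```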

