  1. #1

    Hardware failure... again

    Hi,
    I just went to update my site... and found PHP errors all over the place. I logged into my site's support panel and found that my server had failed... again. I am on shared webhosting.
    The company says that their Dell server had a manufacturing defect. This is the second server I have had die. The server before it was a "brand new dell poweredge" (their words), and now I'm on another "brand new dell poweredge" that shouldn't fail.
    Should I be concerned about this? How many servers have other hosters had fail?
    Thanks,

  2. #2
    Join Date
    Jun 2003
    Location
    UK
    Posts
    6,616
    Well hardware does fail. It could be they are unlucky. What is the time period between them?

    Rus
    Russ Foster - Industry Curmudgeon
    Freelance Sysadmin for Hire - email vaserv@gmail.com

  3. #3
    Join Date
    Nov 2004
    Location
    Wisconsin
    Posts
    148
    While hardware can fail, it doesn't happen too often. Like vaserv asked, what is the time frame between the two failures? If it is something like a few weeks then I would be concerned. Also, did they say exactly what failed in the servers? Hard drives and generic power supplies are the most common things to fail in a server (at least in my experience as a computer technician).
    ~ Nick

  4. #4
    It's been about a month and a half. The first server failed before I had even moved a site on it (maybe a week after I registered). I have also been hacked once on their servers.
    I never had a problem with my old host... no hardware failures, no hacking, nothing.
    But for what I was paying at my old host I could've had a VPS...

  5. #5
    Join Date
    Jun 2003
    Location
    UK
    Posts
    6,616
    Going by what you are saying, perhaps you should look to pay a bit more and find a new host... The way I see it, you are getting what you pay for.

    Rus
    Russ Foster - Industry Curmudgeon
    Freelance Sysadmin for Hire - email vaserv@gmail.com

  6. #6
    Join Date
    Aug 2003
    Location
    Vancouver, BC
    Posts
    1,894
    That certainly seems like a string of bad luck. Not sure why it happened, but it can happen to anyone.
    Gary Jones

    BlueFur.com - Canada Web Hosting

  7. #7
    Join Date
    Nov 2004
    Location
    Wisconsin
    Posts
    148
    I would definitely switch to a different host. There are very few reasons that two servers should fail within 2 months. When you say that you got hacked, that isn't necessarily the company's fault: they could harden their servers as much as possible, but if you are running an insecure script, you can still get hacked.
    ~ Nick

  8. #8
    A message from support when I asked:

    We're experiencing the File System of the primary master HDD.I don't know what was the problem a month ago, I'm still new here.
    What does that mean? How can you experience the file system... someone please clarify.
    On the plus side, I got a response within 5 minutes, thumbs up for their tech support!

  9. #9
    Join Date
    Nov 2001
    Posts
    5,383
    Originally posted by neb1211
    While hardware can fail, it doesn't happen too often.
    Quite to the contrary, I believe there are quite a number of hard drive failures in this industry daily.
    Clustered Hosting With Continuous Data Protection (CDP)
    http://www.solidinternet.com
    8 Years of hosting excellence!

  10. #10
    Join Date
    Nov 2001
    Posts
    5,383
    Originally posted by adb22791
    A message from support when I asked:


    What does that mean? How can you experience the file system... someone please clarify.
    On the plus side, I got a response within 5 minutes, thumbs up for their tech support!
    Well, his response doesn't even register here. I have no idea what he is trying to say; maybe you should ask for clarification?
    Clustered Hosting With Continuous Data Protection (CDP)
    http://www.solidinternet.com
    8 Years of hosting excellence!

  11. #11
    Join Date
    Feb 2005
    Location
    Northern VA
    Posts
    1,582
    That's a pretty crummy tech support answer - "I'm new here and I'm relaying a message about something I don't understand..."

    Seems like they could get you a much better answer than one that makes no sense.

    I think a reliable host should have some redundancy so that if one system fails you don't lose your site until the hardware is back up.
    Rich
    Husband, Father, Retired Marine, Geek

  12. #12

    Update!
    From my web host:


    I would like to make a follow-up on today's [edited for privacy] server failure.

    During a scheduled kernel upgrade the server suffered an unforeseen software failure caused by faulty hardware. We immediately contacted our datacenter who failed to recover the server.

    It is [removed for privacy] policy at the occurrence of such failures to start our backup server. Until a short while ago your page was loaded from the backup machine. In the process of recovering the backup we have found that some customers have corrupted MySQL databases and their databases could not be recovered from our backup. Fortunately, every month [removed for privacy] does special tape backups and we were able to recover databases, but these are about 20 days old.

    At the moment your page is loading from the [removed for privacy] server. It took us quite a while to fix the server because we had to wait for several hours for the datacenter technicians to recover the server and get it back online.

    The [removed for privacy] staff is doing our best to fix the problem and recover the data in order to have your account fully operational. We will work overtime to make sure your site is fully operational and you are not experiencing any trouble. I understand that problems like today's one are absolutely undesirable. I can assure you that the [removed for privacy] staff will not have a rest until every single site has been fixed and until all customers from the server are moved to a new, safer and more reliable server.


    The employee who answered earlier does not usually handle support tickets, but because the other employees were busy with the server and with the backup server, he answered my question.
    Seems like it will all end well... I will update when my site is fully functional.

  13. #13
    Join Date
    Sep 2000
    Posts
    389
    It seems to be just some bad luck. I would suggest giving it another chance.

  14. #14
    Join Date
    Apr 2001
    Location
    Montana USA
    Posts
    673
    This story brings up an interesting issue -- your web host had to wait for datacenter technicians to do something. Of course, some (a minority) web hosts ARE themselves datacenter technicians, and would have their hands on a hardware failure by running upstairs, or opening the server room door, or driving across town. Not sure if it's a big consideration when shopping for hosting, but it's something.
    John Masterson
    Former Hosting Company Owner

  15. #15
    Turns out my host's server is located in The Planet datacenter...

  16. #16
    Join Date
    May 2005
    Location
    California, USA
    Posts
    20
    Odd. I would think they would have mirroring or some sort of data protection in place. Probably bad hardware or just bad luck.

  17. #17
    But what happens if you are running RAID1 and your main hard drive decides its time is up?
    If it fails slowly over a few minutes, all data is mirrored over to the second drive.

    So you then have two drives which are pretty useless.

  18. #18
    Join Date
    May 2005
    Location
    California, USA
    Posts
    20
    Hmm I think the best way then would be to probably use high-availability or deploy something of that sort. Either way, I would think mirroring would be better.
    Ealuco Inc.
    http://www.ealuco.com

  19. #19
    Join Date
    Feb 2002
    Location
    New York, NY
    Posts
    4,618
    Originally posted by Duport
    But what happens if you are running RAID1 and your main hard drive decides its time is up?
    If it fails slowly over a few minutes, all data is mirrored over to the second drive.

    So you then have two drives which are pretty useless.
    What? If you're running a RAID1 mirror and one of the drives fails, it won't affect the other drive.
    Scott Burns, President
    BQ Internet Corporation
    Remote Rsync and FTP backup solutions
    *** http://www.bqbackup.com/ ***

  20. #20
    Join Date
    Sep 2004
    Location
    Dallas, TX
    Posts
    367
    This is one of the disadvantages of most web hosts. They have one server handling all requests. If a hardware or software failure brings the server down, your site will also go down. Although rare, there are some web hosts that offer a higher level of redundancy.

    The quality of hardware and whether or not the server was stress tested before deployment play a key role in reducing hardware failure rates. “Dell PowerEdge” is a wide line of servers that could range anywhere from a $349 tower server to a $1299+ rack mount. We prefer to “burn-in” all servers for at least two weeks before bringing them into production.
    I N T H R I V E
    when you can't afford downtime
    sales@inthrive.com
    High Availability Web Hosting

  21. #21
    According to my host, they have a two server system. There is always a backup server (which has all the accounts backed up to it daily). I am happy with the host and support, but sometimes the site is very slow. I believe they are renting a dedicated server from The Planet, and not colocating. Has anyone else had problems with hardware at The Planet (specifically when doing kernel upgrades)?

  22. #22
    Join Date
    Aug 2004
    Location
    South Daytona, FL
    Posts
    2,476
    Originally posted by bqinternet
    What? If you're running a RAID1 mirror and one of the drives fails, it won't affect the other drive.
    Not entirely true. If the drive doesn't fail completely, there is a chance that data gets corrupted (and mirrored to the other drive) before the drive fails and the RAID controller takes it offline. If that happens, the remaining healthy drive is left with the mirrored corrupt data. This type of failure is very rare, but it does happen. I've also seen RAID controllers incorrectly take the wrong drive offline when there was a problem.
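
    To make the difference between a clean failure and a slow one concrete, here is a rough toy sketch in Python. It is not how any real RAID controller works; the class `ToyMirror`, its `failing` flag, and the "garbage" read are all made-up stand-ins, assuming reads are served from the first online disk and writes go to every online disk.

```python
class ToyMirror:
    """Hypothetical two-disk RAID1 model: writes go to every online disk."""

    def __init__(self):
        self.disks = [{}, {}]          # block name -> data, one dict per disk
        self.online = [True, True]     # which disks the "controller" still trusts
        self.failing = False           # disk 0 returns garbage but is still online

    def write(self, block, data):
        # Mirror the write to every disk that has not been taken offline.
        for i, disk in enumerate(self.disks):
            if self.online[i]:
                disk[block] = data

    def read(self, block):
        # Serve the read from the first online disk.
        for i, disk in enumerate(self.disks):
            if self.online[i]:
                data = disk[block]
                if i == 0 and self.failing:
                    return "#" * len(data)   # stand-in for silently corrupted data
                return data
        raise IOError("no disks online")

    def append(self, block, suffix):
        # Read-modify-write: whatever was read is written back to both disks.
        self.write(block, self.read(block) + suffix)


def clean_failure():
    m = ToyMirror()
    m.write("row1", "alice")
    m.online[0] = False        # disk 0 dies outright and is dropped immediately
    m.append("row1", "!")
    return m.read("row1")      # the healthy disk still holds good data


def slow_failure():
    m = ToyMirror()
    m.write("row1", "alice")
    m.failing = True           # disk 0 starts returning garbage, still online
    m.append("row1", "!")      # garbage is read, then mirrored to both disks
    m.online[0] = False        # controller finally drops disk 0
    return m.read("row1")      # the "healthy" disk now holds the corrupt copy


print(clean_failure())   # alice!
print(slow_failure())    # #####!
```

    In the clean case the surviving drive is fine, which matches what bqinternet said. In the slow case the bad data was written to both drives before the failing one was dropped, which is why a mirror is not a substitute for backups.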
    "Arms discourage and keep the invader and plunderer in awe, and preserve order in the world as well as property... Horrid mischief would ensue were the law-abiding deprived of the use of them." - Thomas Paine
