  1. #1
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135

    How would you handle this situation?

    I'm at a loss here, really. Here's the deal, what I've been going through the past few hours with a "professional" dedicated server company (no names, it is what it is):


    A bit of background:
    I signed up with this company in November of last year. In 6 months, I've seen 2 hard drive failures, which is more than I've seen in my 10 years of working online, honestly. Sure, I know these happen, and I'm willing to forgive to a fault, but I can't excuse everything here.

    So yesterday afternoon, client messages me and tells me her domain is not working. Not sure why my own monitors didn't catch it, other than it was responding but barely.

    As usual, I fire up APC and reboot the server around 5pm (CST) yesterday. I tell the client to text me in 20 minutes if it's not up (usually it is, but sometimes it takes a while). Around 5:30, I check it, still not up, so I open a ticket with the DC, marked 'high priority', and let it go.

    Just over an hour later, tech responds to ticket stating that the primary (sda) hard drive was throwing a fit and he had to disable it. Turns out, he was partially right:



    [Load: 5.62] [Time: 08:23:46]
    [root@sithne /home/backuptmp/cpbackup/daily]: mount /dev/sda1 /backup/
    mount: /dev/sda1 already mounted or /backup/ busy
    [Load: 5.21] [Time: 08:26:38]
    [root@sithne /home/backuptmp/cpbackup/daily]: lsof /dev/sda1
    [Load: 5.20] [Time: 08:26:43]
    [Load: 5.01] [Time: 08:26:49]
    [root@sithne /home/backuptmp/cpbackup/daily]: lsof | grep sda1
    [Load: 5.76] [Time: 08:27:16]
    As you can see there, the drive won't mount. Even after multiple reboots, the system still thinks it's mounted, or busy, or something. Checks tell me that it wasn't properly unmounted, fsck can't fix it, and fdisk gives errors:

    Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel

    Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
    These don't concern me terribly much, but they do tell me that the drive is pretty much unrecoverable, and I need to image it to TRY to get what I can off of it. This, unfortunately, is the problem.
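For anyone hitting the same "already mounted or busy" message with an empty lsof, here is a minimal sketch of a few further checks, assuming a Linux box with /proc and /sys available. The helper name `is_mounted` is made up for illustration, and the device names simply mirror the transcript above. One common cause (not confirmed for this server) is that the kernel has the partition claimed by software RAID, device-mapper, or swap rather than by a mount, which lsof will not show.

```shell
#!/bin/sh
# is_mounted DEV [TABLE]: succeed if DEV appears as a mounted source in the
# kernel's mount table. TABLE defaults to /proc/mounts; the second argument
# exists only so the check can be tried against a saved copy.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

# Hypothetical follow-up checks (run as root; sda1 is taken from the post):
#   is_mounted /dev/sda1 && echo "kernel says it is mounted"
#   cat /proc/mdstat                    # is the partition part of an md array?
#   ls /sys/block/sda/sda1/holders      # any device-mapper/md device holding it?
#   cat /proc/swaps                     # is it in use as swap?
```

If all of those come back empty and the mount still fails, that points more toward the drive or its partition table than toward something holding the device open.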

    When the company told me last night the drive had to be disabled, I wasn't surprised. Like I said, second hard drive failure in 4 months (the last was end of December, early January). Thankfully this was the old backup drive, so it only held mysql data and stuff I didn't need 100% (well, except for some archived stuff). MySQL data was backed up, so not horribly concerned there either.

    The problem I have is with the company's response here, or lack thereof, to a critical situation. 7 pm was their last response last night (though apparently they logged into the server via console), and 12 hours later, I get an email stating they feel it's best to leave their services (no duuh, really)? No "we're sorry we screwed up again", no "we'll put another drive in for you, since ours are horrible", simply "we feel it's best you leave".

    So, I'm being told to leave services because a company can't provide proper hardware to keep clients happy? In 6 months, I've had 9 tickets open (2 due to hardware failures, 2 network, 2 APC), nothing hugely unusual. I'm not a burdensome customer, in fact I pay my bills, and most months that's that.

    The problem? I need the data off the backup drive. While it's not mission critical, there was around 30G of VB attachments, and a ton of backups. This is a case of not getting what one paid for, I'd think, no?

    Anyways, aside from moving, what would you do here? I've insisted a number of times that they put a proper drive in so I can image the old one (for retrieval purposes), but it's like they just don't want to follow simple instructions here (or to actually do their jobs and answer tickets).

    Edit:
    The drive in question? SAMSUNG HD502IJ. Just doing a bit more research shows these aren't horribly great drives. Cheap, but not really great.
    Last edited by whmcsguru; 04-12-2011 at 08:53 AM.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  2. #2
    Maybe you can ask the data center to replace the drive with a better model that you have researched, and have them order it, like the rest of us do. Yes, they might charge a bit more for the special HDD, but at least you'd know it is a good one. I would also ask whether they would connect the old drive up somewhere you can try to get the data off of it, or even whether you could buy it to try to image it. I would allow any of my clients to do this, simply because we know that the client's data is the number one priority.

  3. #3
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    I've actually asked multiple times for them to place a new drive in the server so I can image the old one. Been going on about that since about 10pm last night (CST), after it became obvious that was necessary. They're pretty much just ignoring that right now.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  4. #4
    Join Date
    Jun 2003
    Location
    UK
    Posts
    6,616
    Did you get any temperatures from the server? Something seems a bit missing to me between you asking for the hardware to be replaced and them asking you to leave the service.
    Russ Foster - Industry Curmudgeon
    Freelance Sysadmin for Hire - email vaserv@gmail.com

  5. #5
    Join Date
    Jan 2005
    Location
    Scotland, UK
    Posts
    2,681
    They asked you to leave because of what exactly? Are they talking about response time when they said they "screwed up"?

    Also, why do you need the data off the drive? Isn't it easier just to restore backups and move on? Why bother with a provider that doesn't want you and is slow in emergencies?
    Server Management - AdminGeekZ.com
    Infrastructure Management, Web Application Performance, mySQL DBA. System Automation.
    WordPress/Magento Performance, Apache to Nginx Conversion, Varnish Implimentation, DDoS Protection, Custom Nginx Modules
    Check our wordpress varnish plugin. Contact us for quote: sales@admingeekz.com

  6. #6
    WOW, it is amazing that they are ignoring you over a hardware failure. Any dedicated service provider knows that this is their #1 priority.
    Last edited by Hostd8; 04-12-2011 at 09:53 AM.

  7. #7
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    Quote Originally Posted by rghf View Post
    Did you get any temperatures from the server? Something seems a bit missing to me between you asking for the hardware to be replaced and them asking you to leave the service.
    Actually, I wasn't able to do so. I wasn't at home when I got the message that it was down, so it was just a simple APC reboot, which it never recovered from.

    Quote Originally Posted by Scott.Mc View Post
    They asked you to leave because of what exactly?
    Their exact wording:
    In light of the recent events, your dissatisfaction with the network, the server hardware and support we think it is best at this time to not continue our business arrangement.
    I dunno what 'recent events' they're talking about here, I haven't had a ticket opened up in close to a month, and that was more of a DDOS issue on their network.

    As far as lack of satisfaction with hardware and support, well, I think it's reasonable to expect anyone would be unsatisfied if mission critical sites were kept offline for hours because they gave poor hardware. Not even a peep for 12 hours.
    Quote Originally Posted by Scott.Mc View Post
    Are they talking about response time when they said they "screwed up"?
    Oh no, they're not saying they screwed up at all, not accepting any responsibility for the drive malfunction.

    Quote Originally Posted by Scott.Mc View Post
    Also, why do you need the data off the drive? Isn't it easier just to restore backups and move on?
    There's two problems with that, in this case:

    #1: MySQL was the only thing lost that pertained to a majority of the accounts. There was about 15GB of data and about 5k databases that had to be restored manually (which was easily enough done once the backup's backup got restored, though it was a couple days old)

    #2: vBulletin attachments for two accounts totaling around 30G + were stored on the backup drive as well. These were a couple fansites for bands I enjoy listening to. While I might have the data backed up locally, it's still a nightmare getting it back in. This was put onto a backup partition because I didn't want to clutter the home partition with it, and at one point I thought it might help spread the hardware wear out a bit, just like putting sql on the backup drive.

    Yeah, I know, it's my own fault for not backing up the vB attachment data, and there's nobody to blame for that but me, but I can't help but try to get that data back somehow.
    Quote Originally Posted by Scott.Mc View Post
    Why bother with a provider that doesn't want you and is slow in emergencies?
    Like I told them, at my earliest convenience, I'm outta there, believe me. I can buy hardware failures once, but repeated hardware failures, and poor response times means it's time to move on, and I am, but I still need the data contained on that drive for the 2 sites to continue to function. Yeah, shoulda been backed up, I know, but the current backup situation is very limited.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  8. #8
    Join Date
    Feb 2007
    Location
    Chicago, IL
    Posts
    205
    What company is this? Seems a bit unusual to give a client the boot because they have issues with your network or bad hardware.
    Marc Schulz
    Managing Implementation Engineer
    Steadfast : Managed Cloud & Infrastructure Services

  9. #9
    Join Date
    Jan 2011
    Location
    Ohio
    Posts
    467
    GBLX I believe is the provider...

  10. #10
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    Quote Originally Posted by chrono-it View Post
    What company is this? Seems a bit unusual to give a client the boot because they have issues with your network or bad hardware.
    This isn't a very large company, as I found out just getting into the deal. I'm not going to name the company here though.

    Quote Originally Posted by bluemer View Post
    GBLX I believe is the provider...
    They are in the bandwidth mix, yes, but they are not actually the provider of anything but bandwidth. The server is hosted by what was thought to be a professional datacenter.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  11. #11
    Join Date
    Jan 2005
    Posts
    450
    Quote Originally Posted by chrono-it View Post
    What company is this? Seems a bit unusual to give a client the boot because they have issues with your network or bad hardware.
    Pure speculation here, but I can't help but wonder if they are giving him the boot because he flew off the handle and gave them a 'piece of his mind'...

    At any rate, if they haven't gone as far as powering down the box and telling you to get lost, they should be willing to throw another drive in the box for you to recover your data.

    Just my two cents
    CityWideHost.com - Web Hosting YOUR Way!
    Non-Oversold Hosting, Webmaster Services, Web Design, and Asterisk PBX Management!
    24/7 Support - Powered by Bobcares!

  12. #12
    Join Date
    Dec 2009
    Location
    Maryland, USA.
    Posts
    52
    Wow... linux-tech, that really sucks.

    What would I do if I was in your situation? Well, I don't think that there would be anything else to do but to just move away from them.

    But when you get a new hard drive, you should run a Linux badblocks check on it. In its write-mode test, it writes test patterns to every block on the drive and reads them back to check for bad sectors (this wipes whatever is on the drive, so do it before the drive goes into use). Even though this will not tell you whether the electronics on the hard drive will fail or not, it will still tell you whether there are any bad sectors on the drive before you start using it and putting important data on it. I've actually been able to detect many bad "new" hard drives and send them back for a replacement this way.

    en.wikipedia.org/wiki/Badblocks
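As a rough sketch of what such a burn-in run could look like: the helper below only builds the badblocks command line for one of the tool's three test modes, so it can be reviewed before anything touches a disk. The function name `burn_in_cmd` and the placeholder /dev/sdX are made up for illustration; the flags themselves (-w write-mode, -n non-destructive read-write, -s progress, -v verbose) are standard badblocks options.

```shell
#!/bin/sh
# burn_in_cmd DEV [MODE]: print a badblocks command for DEV.
#   read            read-only scan (the default badblocks behaviour)
#   nondestructive  read-write test that preserves existing data (-n)
#   destructive     write-mode pattern test, ERASES the drive (-w)
burn_in_cmd() {
    dev=$1
    mode=${2:-read}
    case "$mode" in
        read)           flags="-sv" ;;
        nondestructive) flags="-nsv" ;;
        destructive)    flags="-wsv" ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    echo "badblocks $flags $dev"
}

# Hypothetical usage on a NEW, EMPTY drive (as root):
#   burn_in_cmd /dev/sdX destructive      # inspect the command first
#   $(burn_in_cmd /dev/sdX destructive)   # then actually run it
```

The destructive mode is only appropriate before the drive holds anything you care about; on a drive already in service, the read-only or non-destructive modes are the safer choice.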

  13. #13
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    Quote Originally Posted by citywidehost View Post
    Pure speculation here, but I can't help but wonder if they are giving him the boot because he flew off the handle and gave them a 'piece of his mind'...
    Actually, I was very, very tame, despite the fact that they still haven't bothered to do what they were instructed to do last night. See for yourself, I've nothing to hide, and honestly, I would be completely fine with a client handling an emergency situation the same way.

    me
    Client 04/11/2011 21:25
    This is the second hard drive problem you've provided me with in 4 months, which, of course is unacceptable. Right now, all of my services are down because MySQL (which was the only thing on the secondary drive) is pretty much unusable.

    Right now, this service is pretty unacceptable. An hour delay for a critical ticket, 2 hard drive failures (which is exactly what this is), no urgent response whatsoever. Something needs to be changed here.

    Right now, I'm in the process of creating an image of this drive. Once done (tonight at some point), you'll need to swap out the drive, and some sort of reimbursement for my time will need to be added here.

    Your hardware caused this issue, and this needs to be fixed.

    me
    Client 04/11/2011 22:23
    In order for this to be fixed, you'll need to put in a replacement hard drive while leaving the other 2 intact. This hard drive must be professional grade and quality, and this needs to be done tonight.

    These hard drive failures are too much, really. I can't afford hours of downtime because of cheap parts being thrown into this server.

    Take the server down to add the new hard drive whenever, it is worthless without the services that are required. Do NOT try to image the drive yourself, do NOT do anything but follow specific instructions, which are to take the server down, add a server grade hard drive that should have been put in on install, and put the server back online. Notify me once you have done so.

    me
    Client 04/12/2011 01:15
    Ahh yes, the sound of silence, nothing being done on an urgent ticket in hours. Do tell, WHEN do you actually plan on following instructions here, or do you just plan on keeping my business down because of your hardware issues permanently?

    me
    Client 04/12/2011 06:05
    HELLO??????
    For 12+ hours now, my business has been down because of your hardware issues. What, exactly do you plan on doing about this? This is NOT because of an OS issue, as observed, right here:
    "Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)".
    This drive is corrupt, WHEN will you get around to replacing it as instructed hours ago?
    I'll admit, I got progressively more agitated with the situation, but, come on, who in their right mind here wouldn't? Absolutely zero responses from 7pm to 7am in a critical situation? I ended up sleeping on the couch, checking the backup sync every few hours, checking email every few hours, and still nothing from them. If I treated my clients like this, I would expect an earful, quite literally.

    They were given very specific instructions, because the last time this happened, they botched the whole thing up, taking liberties they were not allowed to by attempting to image a drive instead of replacing it, which ended up failing, costing me more downtime than they would have if they simply just swapped the drive out and let me restore from backup as stated.

    Quote Originally Posted by citywidehost View Post
    At any rate, if they haven't gone as far as powering down the box and telling you to get lost, they should be willing to throw another drive in the box for you to recover your data.
    You'd think that would be the case, wouldn't you? Sadly, still no tertiary drive.

    I tried to image (and compress) the data from the bad drive to the primary, but it wasn't possible, even using gzip. Seeing as how the two drives are exactly the same size, it didn't work. Couldn't help but try, again.
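A small sketch of checking up front whether an image can fit, assuming a POSIX `df`. The helper name `has_room_for` and the paths in the usage comments are hypothetical. A raw image is as large as the source partition regardless of how full the filesystem was, and how far gzip shrinks it depends on what the unused blocks contain, so the uncompressed size is the only safe number to plan with.

```shell
#!/bin/sh
# has_room_for BYTES DIR: succeed if the filesystem holding DIR has at least
# BYTES of free space available to the current user.
has_room_for() {
    need_bytes=$1
    dest_dir=$2
    # df -P guarantees one data line per filesystem; column 4 is available KB.
    free_kb=$(df -Pk "$dest_dir" | awk 'NR == 2 { print $4 }')
    need_kb=$(( (need_bytes + 1023) / 1024 ))
    [ "$free_kb" -ge "$need_kb" ]
}

# Hypothetical usage (as root; device and directory are placeholders):
#   size=$(blockdev --getsize64 /dev/sda1)
#   has_room_for "$size" /home/backuptmp || echo "not enough room for a raw image"
```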

    Edit:
    Yes, I realize I said MySQL was the only thing on the drive in the ticket, only because I had forgotten about the setup for the vb attachments. There may even be other stuff on that drive I don't recall putting there.
    Last edited by whmcsguru; 04-12-2011 at 11:36 AM.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  14. #14
    Join Date
    Apr 2009
    Location
    Dallas/FortWorth TX
    Posts
    1,703
    I guess they asked you to leave because new drives cost a lot (lol) and there are none available on craigslist or fleabay.

    LMAO

  15. #15
    Join Date
    Aug 2003
    Location
    Montréal
    Posts
    953
    Ask your provider if they can send the drive to a data recovery service. We use Vital Data in such cases (http://www.vitaldata.ca/) they are very good but there are many other options. Vital Data will be able to let you know if they can recover the data and how much it would cost. If the data is very important for you I guess you will be willing to pay the price.
    :: Martin Leclair
    :: Linkedin Profile

  16. #16
    Quote Originally Posted by webphyre View Post
    en.wikipedia.org/wiki/Badblocks
    A new tool to put in my arsenal. Thanks
    IOFLOOD.com -- We Love Servers
    Phoenix, AZ Dedicated Servers in under an hour
    ★ Ryzen 9: 7950x3D ★ Dual E5-2680v4 Xeon ★
    Contact Us: sales@ioflood.com

  17. #17
    Join Date
    Dec 2009
    Location
    Maryland, USA.
    Posts
    52
    Quote Originally Posted by funkywizard View Post
    A new tool to put in my arsenal. Thanks
    No problem!

    Just remember to run it twice

    The first run should find any bad blocks on the drive so that the drive knows not to use that bad sector in the future. If you still get bad sectors on the second run, then that drive will more than likely go bad on you and you should try to return it and exchange it for a different one.

  18. #18
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    Quote Originally Posted by atchoooo View Post
    Ask your provider if they can send the drive to a data recovery service.
    Why? It's not necessary to send the drive off, all the provider has to do is put a tertiary drive in the server. Once that's done, I can use dd to create an image of the drive, mount it, check it, and take what I need from the image. It's not always necessary to send the drive off.
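A minimal sketch of that dd-then-inspect workflow, assuming GNU or BSD dd and a third drive mounted somewhere with enough room. The function name `image_device` and every path in the usage comments are placeholders, and the e2fsck line assumes an ext2/3/4 filesystem, which the thread does not confirm.

```shell
#!/bin/sh
# image_device SRC DEST: copy a (possibly failing) block device to an image file.
#   conv=noerror  keep going after a read error instead of aborting
#   conv=sync     pad short or failed reads with zeros so offsets stay aligned
image_device() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=65536 conv=noerror,sync
}

# Hypothetical usage (as root):
#   image_device /dev/sda1 /mnt/newdrive/sda1.img
#   e2fsck -n /mnt/newdrive/sda1.img                  # check only, change nothing
#   mount -o loop,ro /mnt/newdrive/sda1.img /mnt/recovered
```

Working from the image, mounted read-only, means the failing drive is read only once. With this block size a single unreadable sector zeroes out its whole 64 KiB block in the image; a smaller `bs`, or a purpose-built tool such as GNU ddrescue, trades speed for less collateral loss.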

    The problem is that this specific "datacenter" (yeah, right) doesn't actually want to do their job here. They'd just rather blame the customer for their poor drives (Samsung, really??)
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  19. #19
    Quote Originally Posted by linux-tech View Post
    They'd just rather blame the customer for their poor drives (Samsung, really??)
    The choice of a drive brand is often up to preference. Mac vs Windows, Oracle vs Mysql. Although I personally don't prefer Samsung drives, I wouldn't think someone was being irresponsible by choosing to use them. I had a bunch of WD and Samsung drives in 10 colo machines I set up a few years back, and yes, the WD drives were more reliable and faster, but the Samsung drives weren't total junk. Though I personally stick to WD in almost all cases, I'm sure there's plenty of people out there who would similarly find fault with that decision of mine as well. To each his own.
    IOFLOOD.com -- We Love Servers
    Phoenix, AZ Dedicated Servers in under an hour
    ★ Ryzen 9: 7950x3D ★ Dual E5-2680v4 Xeon ★
    Contact Us: sales@ioflood.com

  20. #20
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    I should have phrased that better, something like cheap Samsung drives. I'm sure that there's a good purpose for Samsung drives, somewhere, but like Seagate and WD, the cheaper the drive, the more problematic it is. In this case, it's obvious that either they got a REALLY bad lot of these, or they're cheap (around $40/drive) for a reason (because they're made poorly), as made obvious by the fact that 2 have gone bad in the past 6 months.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  21. #21
    Quote Originally Posted by linux-tech View Post
    I should have phrased that better, something like cheap Samsung drives. I'm sure that there's a good purpose for Samsung drives, somewhere, but like Seagate and WD, the cheaper the drive, the more problematic it is. In this case, it's obvious that either they got a REALLY bad lot of these, or they're cheap (around $40/drive) for a reason (because they're made poorly), as made obvious by the fact that 2 have gone bad in the past 6 months.
    I had a provider once where the failure rate on drives was so high that I wouldn't have been surprised if the "dead pile" and the "new pile" were mixed together as a matter of policy. Could be any number of reasons for the failures: vibration, heat, bad luck, or buying cheap drives, as you suggested.
    IOFLOOD.com -- We Love Servers
    Phoenix, AZ Dedicated Servers in under an hour
    ★ Ryzen 9: 7950x3D ★ Dual E5-2680v4 Xeon ★
    Contact Us: sales@ioflood.com

  22. #22
    Join Date
    Sep 2002
    Location
    Top Secret
    Posts
    14,135
    Just a quick update here:
    After what seemed like an eternity (20 hours), the DC has placed a new drive in the server for me to use. Not the exact same specs, but that can be good as well as bad.

    Unfortunately, it looks like the same situation with this drive: creating a partition and mounting the drive gave the same errors before the reboot, and now it looks like the server is not coming back up.

    Ticket opened, we'll see how long they take to actually get the server back. At this point, I'm looking at getting out of there in the next few days, because, this kind of inconsistency can't keep up.
    Tom Whiting, WHMCS Guru extraordinaire
    Linux problems? WHMCS Problems? Give me a shout
    Check out my WHMCS Addons

  23. #23
    Quote Originally Posted by linux-tech View Post
    [From his Signature]: Datacenter Talk Forums : Server and datacenter discussion and help.
    I can't help but notice the irony that the man who runs the Datacenter Forum can't catch a break from the datacenters.
    I'm not sure why you feel the need to protect the company that is working you over. Name them, and let the power of negative publicity convince them to do their job with integrity.

  24. #24
    Join Date
    Mar 2003
    Location
    California USA
    Posts
    13,681
    Quote Originally Posted by linux-tech View Post
    I've actually asked multiple times for them to place a new drive in the server so I can image the old one. Been going on about that since about 10pm last night (CST), after it became obvious that was necessary. They're pretty much just ignoring that right now.
    Why didn't you image over netcat or ssh to another server instead of wasting time waiting on them?
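A sketch of what imaging over the network could look like, assuming a second server reachable over SSH with enough free disk. The function name `stream_image`, the host name, and the user are placeholders. The function only produces a gzip-compressed image on stdout, so the transport can be swapped.

```shell
#!/bin/sh
# stream_image SRC: read SRC with dd, tolerating read errors, and write a
# gzip-compressed image to stdout so it can be piped to another machine.
stream_image() {
    dd if="$1" bs=65536 conv=noerror,sync | gzip -c
}

# Hypothetical usage (as root on the failing server):
#   stream_image /dev/sda1 | ssh user@other-server.example 'cat > sda1.img.gz'
#
# netcat variant (listener flags differ between netcat implementations):
#   receiver:  nc -l 9000 > sda1.img.gz
#   sender:    stream_image /dev/sda1 | nc other-server.example 9000
```

Nothing is written to the local disks this way, which sidesteps the problem of the two local drives being the same size.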


    Regarding Samsung - some people have very good luck with them. The same goes with any drive vendor, some people have bad luck with WD and some people have bad luck with Seagate - All Providers are dropping in quality imho.
    Steven Ciaburri | Industry's Best Server Management - Rack911.com
    Software Auditing - 400+ Vulnerabilities Found - Quote @ https://www.RACK911Labs.com
    Fully Managed Dedicated Servers (Las Vegas, New York City, & Amsterdam) (AS62710)
    FreeBSD & Linux Server Management, Security Auditing, Server Optimization, PCI Compliance

  25. #25
    Join Date
    Feb 2007
    Location
    Chicago, IL
    Posts
    205
    Have you eliminated a SATA Port malfunction? It seems odd to have this much bad luck even with horrible drives.
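One hedged way to look into the port or cable theory is to count libata complaints per port in the kernel log: if the errors follow the port rather than the drive, the drive may not be the culprit. The function name `ata_error_summary` is made up for illustration, and the pattern is a rough filter, not an exhaustive list of kernel messages.

```shell
#!/bin/sh
# ata_error_summary: read kernel log text on stdin and print "<port> <count>"
# for each ataN port that logged an error, exception, reset, or failed command.
ata_error_summary() {
    grep -E 'ata[0-9]+(\.[0-9]+)?: .*([Ee]rror|exception|reset|failed)' |
        sed -E 's/.*(ata[0-9]+)(\.[0-9]+)?: .*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Hypothetical usage (as root):
#   dmesg | ata_error_summary
#   smartctl -a /dev/sda      # the drive's own SMART error counters, for comparison
```

If a freshly installed drive on the same port shows the same pattern, swapping the cable or moving the drive to another port would be a cheap next test.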
    Marc Schulz
    Managing Implementation Engineer
    Steadfast : Managed Cloud & Infrastructure Services
