-
04-12-2011, 08:48 AM #1
How would you handle this situation?
I'm at a loss here, really. Here's the deal, what I've been going through the past few hours with a "professional" dedicated server company (no names, it is what it is):
A bit of background:
I signed up with this company in November of last year. In 6 months, I've seen 2 hard drive failures, which is more than I've seen in my 10 years of working online, honestly. Sure, I know these happen, and I'm willing to forgive to a fault, but I can't excuse everything here.
So yesterday afternoon, client messages me and tells me her domain is not working. Not sure why my own monitors didn't catch it, other than it was responding but barely.
As usual, I fire up APC and reboot the server, around 5pm (CST) yesterday. I tell the client to text me in 20 minutes if it's not up (usually it is, but sometimes it takes a while). Around 5:30, I check it, still not up, so I open a ticket with the DC, marked 'high priority', and let it go.
Just over an hour later, tech responds to ticket stating that the primary (sda) hard drive was throwing a fit and he had to disable it. Turns out, he was partially right:
[Load: 5.62] [Time: 08:23:46]
[root@sithne /home/backuptmp/cpbackup/daily]: mount /dev/sda1 /backup/
mount: /dev/sda1 already mounted or /backup/ busy
[Load: 5.21] [Time: 08:26:38]
[root@sithne /home/backuptmp/cpbackup/daily]: lsof /dev/sda1
[Load: 5.20] [Time: 08:26:43]
[Load: 5.01] [Time: 08:26:49]
[root@sithne /home/backuptmp/cpbackup/daily]: lsof | grep sda1
[Load: 5.76] [Time: 08:27:16]
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
When the company told me last night the drive had to be disabled, I wasn't surprised. Like I said, second hard drive failure in 4 months (the last was end of December, early January). Thankfully this was the old backup drive, so it only held mysql data and stuff I didn't need 100% (well, except for some archived stuff). MySQL data was backed up, so not horribly concerned there either.
The problem I have is with the company's response here, or lack thereof, to a critical situation. 7 pm was their last response last night (though apparently they logged into the server via console), and 12 hours later, I get an email stating they feel it's best to leave their services (no duuh, really)? No "we're sorry we screwed up again", no "we'll put another drive in for you, since ours are horrible", simply "we feel it's best you leave".
So, I'm being told to leave services because a company can't provide proper hardware to keep clients happy? In 6 months, I've had 9 tickets open (2 due to hardware failures, 2 network, 2 APC), nothing hugely unusual. I'm not a burdensome customer; in fact I pay my bills, and most months that's that.
The problem? I need the data off the backup drive. While it's not mission critical, there was around 30G of VB attachments, and a ton of backups. This is a case of not getting what one paid for, I'd think, no?
Anyways, aside from moving, what would you do here? I've insisted a number of times that they put a proper drive in so I can image the old one (for retrieval purposes), but it's like they just don't want to follow simple instructions here (or to actually do their jobs and answer tickets).
Edit:
The drive in question? SAMSUNG HD502IJ. Just doing a bit more research shows these aren't horribly great drives. Cheap, but not really great.
Last edited by whmcsguru; 04-12-2011 at 08:53 AM.
Tom Whiting, WHMCS Guru extraordinaire
Linux problems? WHMCS Problems? Give me a shout
Check out my WHMCS Addons
-
04-12-2011, 09:28 AM #2
WHT Addict
- Join Date
- May 2010
- Posts
- 118
Maybe you can ask the data center to replace the drive with a better one that you have researched, and have them order it, like the rest of us do. Yes, they might charge a bit more for the special HDD, but at least you'd know it is a good one. I would also ask whether they would connect the old drive somewhere you can try to get data off it, or even whether you could buy it to try to image it. I would allow any of my clients to do this, simply because we know that the client's data is the number one priority.
-
04-12-2011, 09:32 AM #3
I've actually asked multiple times for them to place a new drive in the server so I can image the old one. Been going on about that since about 10pm last night (CST), after it became obvious that was necessary. They're pretty much just ignoring that right now.
-
04-12-2011, 09:37 AM #4
Web Hosting Master
- Join Date
- Jun 2003
- Location
- UK
- Posts
- 6,616
Did you get any temperatures from the server? Something seems to be missing between you asking for the hardware to be replaced and them asking you to leave the service.
Russ Foster - Industry Curmudgeon
Freelance Sysadmin for Hire - email vaserv@gmail.com
-
04-12-2011, 09:39 AM #5
Engineer
- Join Date
- Jan 2005
- Location
- Scotland, UK
- Posts
- 2,681
They asked you to leave because of what exactly? Are they talking about response time when they said they "screwed up"?
Also, why do you need the data off the drive? Isn't it easier just to restore backups and move on? Why bother with a provider that doesn't want you and is slow in emergencies?
Server Management - AdminGeekZ.com
Infrastructure Management, Web Application Performance, mySQL DBA. System Automation.
WordPress/Magento Performance, Apache to Nginx Conversion, Varnish Implimentation, DDoS Protection, Custom Nginx Modules
Check our wordpress varnish plugin. Contact us for quote: sales@admingeekz.com
-
04-12-2011, 09:48 AM #6
WHT Addict
- Join Date
- May 2010
- Posts
- 118
Wow, it is amazing that they are ignoring you over a hardware failure. Any dedicated service provider knows that this is their #1 priority.
Last edited by Hostd8; 04-12-2011 at 09:53 AM.
-
04-12-2011, 09:56 AM #7
Actually, I wasn't able to do so. I wasn't at home when I got the message that it was down, so it was just a simple APC reboot, which it never recovered from.
Their exact wording:
In light of the recent events, your dissatisfaction with the network, the server hardware and support we think it is best at this time to not continue our business arrangement.
As far as lack of satisfaction with hardware and support, well, I think it's reasonable to expect anyone would be unsatisfied if mission critical sites were kept offline for hours because they gave poor hardware. Not even a peep for 12 hours.
Oh no, they're not saying they screwed up at all, not accepting any responsibility for the drive malfunction.
There's two problems with that, in this case:
#1: MySQL was the only thing lost that pertained to a majority of the accounts. There was about 15GB of data and about 5k databases that had to be restored manually (which was easily enough done once the backup's backup got restored, though it was a couple days old)
#2: vBulletin attachments for two accounts totaling around 30G + were stored on the backup drive as well. These were a couple fansites for bands I enjoy listening to. While I might have the data backed up locally, it's still a nightmare getting it back in. This was put onto a backup partition because I didn't want to clutter the home partition with it, and at one point I thought it might help spread the hardware wear out a bit, just like putting sql on the backup drive.
Yeah, I know, it's my own fault for not backing up the vB attachment data, and there's nobody to blame for that but me, but I can't help but try to get that data back somehow.
Like I told them, at my earliest convenience, I'm outta there, believe me. I can buy hardware failures once, but repeated hardware failures and poor response times mean it's time to move on, and I am, but I still need the data contained on that drive for the 2 sites to continue to function. Yeah, shoulda been backed up, I know, but the current backup situation is very limited.
-
04-12-2011, 10:31 AM #8
Junior Guru
- Join Date
- Feb 2007
- Location
- Chicago, IL
- Posts
- 205
What company is this? Seems a bit unusual to give a client the boot because they have issues with your network or bad hardware.
-
04-12-2011, 10:43 AM #9
Web Hosting Evangelist
- Join Date
- Jan 2011
- Location
- Ohio
- Posts
- 467
GBLX I believe is the provider...
-
04-12-2011, 10:56 AM #10
This isn't a very large company, as I found out just getting into the deal. I'm not going to name the company here though.
They are in the bandwidth mix, yes, but they are not actually the provider of anything but bandwidth. The server is hosted by what was thought to be a professional datacenter.
-
04-12-2011, 10:59 AM #11
Web Hosting Evangelist
- Join Date
- Jan 2005
- Posts
- 450
Pure speculation here, but I can't help but wonder if they are giving him the boot because he flew off the handle and gave them a 'piece of his mind'...
At any rate, if they haven't gone as far as powering down the box and telling you to get lost, they should be willing to throw another drive in the box for you to recover your data.
Just my two cents
CityWideHost.com - Web Hosting YOUR Way!
Non-Oversold Hosting, Webmaster Services, Web Design, and Asterisk PBX Management!
24/7 Support - Powered by Bobcares!
-
04-12-2011, 11:15 AM #12
Junior Guru Wannabe
- Join Date
- Dec 2009
- Location
- Maryland, USA.
- Posts
- 52
Wow... linux-tech, that really sucks.
What would I do if I were in your situation? Well, I don't think there would be anything else to do but move away from them.
But when you get a new hard drive, you should run a Linux badblocks check on it. In write mode it writes test patterns to every sector of the drive and reads them back to check for bad sectors (this wipes the drive, so do it before you put anything on it). Even though this will not tell you whether the electronics on the hard drive will fail or not, it will still tell you whether there are any bad sectors on the drive before you start using it and putting important data on it. I've actually been able to detect many bad "new" hard drives and send them back for a replacement this way.
en.wikipedia.org/wiki/Badblocks
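For reference, here is a minimal sketch of that check as a shell script. The device name /dev/sdX is a placeholder, and the DRY_RUN switch and surface_test function are illustrative names of my own, not part of badblocks. Write mode (-w) erases everything on the drive, so it only makes sense on a new, empty disk. By default it writes a set of fixed patterns; -t random asks for a random pattern instead.

```shell
#!/bin/sh
# Sketch: destructive surface test of a NEW, EMPTY drive with badblocks.
# /dev/sdX is a placeholder; confirm the real device name before running.

DRY_RUN="${DRY_RUN:-1}"   # hypothetical safety switch: 1 = only print the command

surface_test() {
    device="$1"
    # -w         write-mode test (DESTROYS all data on the device)
    # -s         show progress
    # -v         verbose output
    # -t random  use a random test pattern instead of the default fixed ones
    cmd="badblocks -wsv -t random $device"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $cmd"
    else
        $cmd
    fi
}

surface_test /dev/sdX
```

As written, the script only prints the command. Setting DRY_RUN=0 (as root, with the real device name) would actually run the test, which can take many hours on a large drive.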
-
04-12-2011, 11:24 AM #13
Actually, I was very, very tame, despite the fact that they still haven't bothered to do what they were instructed to do last night. See for yourself, I've nothing to hide, and honestly, I would be completely fine with a client handling an emergency situation the same way.
me
Client 04/11/2011 21:25
This is the second hard drive problem you've provided me with in 4 months, which, of course is unacceptable. Right now, all of my services are down because MySQL (which was the only thing on the secondary drive) is pretty much unusable.
Right now, this service is pretty unacceptable. An hour delay for a critical ticket, 2 hard drive failures (which is exactly what this is), no urgent response whatsoever. Something needs to be changed here.
Right now, I'm in the process of creating an image of this drive. Once done (tonight at some point), you'll need to swap out the drive, and some sort of reimbursement for my time will need to be added here.
Your hardware caused this issue, and this needs to be fixed.
me
Client 04/11/2011 22:23
In order for this to be fixed, you'll need to put in a replacement hard drive while leaving the other 2 intact. This hard drive must be professional grade and quality, and this needs to be done tonight.
These hard drive failures are too much, really. I can't afford hours of downtime because of cheap parts being thrown into this server.
Take the server down to add the new hard drive whenever, it is worthless without the services that are required. Do NOT try to image the drive yourself, do NOT do anything but follow specific instructions, which are to take the server down, add a server grade hard drive that should have been put in on install, and put the server back online. Notify me once you have done so.
me
Client 04/12/2011 01:15
Ahh yes, the sound of silence, nothing being done on an urgent ticket in hours. Do tell, WHEN do you actually plan on following instructions here, or do you just plan on keeping my business down because of your hardware issues permamently?
me
Client 04/12/2011 06:05
HELLO??????
For 12+ hours now, my business has been down because of your hardware issues. What, exactly do you plan on doing about this? This is NOT because of an OS issue, as observed, right here:
"Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)".
This drive is corrupt, WHEN will you get around to replacing it as instructed hours ago?
They were given very specific instructions, because the last time this happened, they botched the whole thing up, taking liberties they were not allowed to by attempting to image a drive instead of replacing it, which ended up failing, costing me more downtime than they would have if they simply just swapped the drive out and let me restore from backup as stated.
You'd think that would be the case, wouldn't you? Sadly, still no tertiary drive.
I tried to image (and compress) the data from the bad drive onto the primary, but even with gzip it didn't work, seeing as how the two drives are exactly the same size. Couldn't help but try, though.
Edit:
Yes, I realize I said MySQL was the only thing on the drive in the ticket, only because I had forgotten about the setup for the vb attachments. There may even be other stuff on that drive I don't recall putting there.
Last edited by whmcsguru; 04-12-2011 at 11:36 AM.
-
04-12-2011, 12:12 PM #14
Web Hosting Master
- Join Date
- Apr 2009
- Location
- Dallas/FortWorth TX
- Posts
- 1,703
I guess they asked you to leave because new drives cost a lot (lol) and there are none available on craigslist or fleabay.
LMAO
-
04-12-2011, 12:22 PM #15
Web Hosting Master
- Join Date
- Aug 2003
- Location
- Montréal
- Posts
- 953
Ask your provider if they can send the drive to a data recovery service. We use Vital Data in such cases (http://www.vitaldata.ca/); they are very good, but there are many other options. Vital Data will be able to let you know if they can recover the data and how much it would cost. If the data is very important to you, I guess you will be willing to pay the price.
:: Martin Leclair
:: Linkedin Profile
-
04-12-2011, 01:00 PM #16
IOFLOOD.com -- We Love Servers
Phoenix, AZ Dedicated Servers in under an hour
★ Ryzen 9: 7950x3D ★ Dual E5-2680v4 Xeon ★
Contact Us: sales@ioflood.com ★
-
04-12-2011, 01:42 PM #17
Junior Guru Wannabe
- Join Date
- Dec 2009
- Location
- Maryland, USA.
- Posts
- 52
No problem!
Just remember to run it twice.
The first run should find any bad blocks on the drive so that the drive knows not to use that bad sector in the future. If you still get bad sectors on the second run, then that drive will more than likely go bad on you and you should try to return it and exchange it for a different one.
-
04-12-2011, 05:25 PM #18
Why? It's not necessary to send the drive off; all the provider has to do is put a tertiary drive in the server. Once that's done, I can use dd to create an image of the drive, mount it, check it, and take what I need from the image. It's not always necessary to send the drive off.
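For anyone following along, a rough sketch of that dd workflow. The image_drive function and all paths are placeholders of my own, and the demo at the bottom uses an ordinary file in place of the real partition so it can be run without root. The check step assumes an ext2/3/4 filesystem.

```shell
#!/bin/sh
# Sketch: image a failing partition with dd, then work from the image.
# All paths are placeholders; the demo uses ordinary files, no root needed.

image_drive() {
    src="$1"
    dest="$2"
    # conv=noerror  keep going after a read error instead of aborting
    # conv=sync     pad short or failed reads with zeros so offsets stay aligned
    dd if="$src" of="$dest" bs=64k conv=noerror,sync
}

# Demo with a 128 KiB stand-in file instead of a real device.
workdir=$(mktemp -d)
head -c 131072 /dev/urandom > "$workdir/fake-sda1"
image_drive "$workdir/fake-sda1" "$workdir/sda1.img"
cmp -s "$workdir/fake-sda1" "$workdir/sda1.img" && echo "image matches source"

# On the real server (as root) the equivalent would be roughly:
#   image_drive /dev/sda1 /mnt/newdrive/sda1.img
#   e2fsck -n /mnt/newdrive/sda1.img            # check only, change nothing
#   mount -o loop,ro /mnt/newdrive/sda1.img /mnt/recovery
```

Working from a read-only loop mount of the image means the failing drive only has to survive one full read.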
The problem is that this specific "datacenter" (yeah, right) doesn't actually want to do their job here. They'd just rather blame the customer for their poor drives (Samsung, really??)
-
04-12-2011, 05:55 PM #19
The choice of a drive brand is often up to preference. Mac vs Windows, Oracle vs MySQL. Although I personally don't prefer Samsung drives, I wouldn't think someone was being irresponsible by choosing to use them. I had a bunch of WD and Samsung drives in 10 colo machines I set up a few years back, and yes, the WD drives were more reliable and faster, but the Samsung drives weren't total junk. Though I personally stick to WD in almost all cases, I'm sure there are plenty of people out there who would similarly find fault with that decision of mine as well. To each his own.
-
04-12-2011, 06:02 PM #20
I should have phrased that better, something like cheap Samsung drives. I'm sure that there's a good purpose for Samsung drives, somewhere, but like Seagate and WD, the cheaper the drive, the more problematic it is. In this case, it's obvious that either they got a REALLY bad lot of these, or they're cheap (around $40/drive) for a reason (because they're made poorly), as made obvious by the fact that 2 have gone bad in the past 6 months.
-
04-12-2011, 06:05 PM #21
I had a provider once where the failure rate on drives was so high that I wouldn't have been surprised if the "dead pile" and the "new pile" were mixed together as a matter of policy. Could be any number of reasons for the failures: vibration, heat, bad luck, or buying cheap drives, as you suggested.
-
04-12-2011, 07:07 PM #22
Just a quick update here:
After what seemed like an eternity (20 hours), the DC has placed a new drive in the server for me to use. Not the exact same specs, but that can be good as well as bad.
Unfortunately, looks like the same situation with this drive, as creating a partition, mounting the drive gives the same errors, before reboot, and it looks like it's not coming back up.
Ticket opened, we'll see how long they take to actually get the server back. At this point, I'm looking at getting out of there in the next few days, because this kind of inconsistency can't keep up.
-
04-12-2011, 07:29 PM #23
Web Hosting Evangelist
- Join Date
- Dec 2006
- Posts
- 493
I can't help but notice the irony that the man who runs the Datacenter Forum can't catch a break from the datacenters.
I'm not sure why you feel the need to protect the company that is working you over. Name them, and let the power of negative publicity convince them to do their job with integrity.
-
04-12-2011, 07:39 PM #24
Problem Solver
- Join Date
- Mar 2003
- Location
- California USA
- Posts
- 13,681
Why didn't you image over netcat or ssh to another server instead of wasting time waiting on them?
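A rough sketch of what that could look like. The stream_image function, hostnames, port and paths are all placeholders, and the netcat listener syntax differs between netcat variants, so treat those lines as an assumption to check against your own nc. The demo at the bottom runs locally, with a plain file standing in for the device and cat standing in for the remote end.

```shell
#!/bin/sh
# Sketch: stream a compressed image off the box instead of writing it locally.

stream_image() {
    # Read the source, tolerate read errors, and emit a gzip stream on stdout.
    dd if="$1" bs=64k conv=noerror,sync | gzip -c
}

# Real-world use (hostnames, port and paths are placeholders):
#   stream_image /dev/sda1 | ssh user@otherserver 'cat > /backup/sda1.img.gz'
# or with netcat (listener syntax differs between netcat variants):
#   otherserver$ nc -l -p 9000 > /backup/sda1.img.gz
#   thisserver$  stream_image /dev/sda1 | nc otherserver 9000

# Local demo: a plain file stands in for the device, cat for the remote end.
workdir=$(mktemp -d)
head -c 131072 /dev/urandom > "$workdir/fake-sda1"
stream_image "$workdir/fake-sda1" | cat > "$workdir/sda1.img.gz"
gzip -dc "$workdir/sda1.img.gz" | cmp -s - "$workdir/fake-sda1" && echo "round trip ok"
```

Compressing on the sending side keeps the transfer small when the drive is mostly empty, and nothing has to be written to the other local disk.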
Regarding Samsung - some people have very good luck with them. The same goes for any drive vendor: some people have bad luck with WD and some people have bad luck with Seagate. All providers are dropping in quality imho.
Steven Ciaburri | Industry's Best Server Management - Rack911.com
Software Auditing - 400+ Vulnerabilities Found - Quote @ https://www.RACK911Labs.com
Fully Managed Dedicated Servers (Las Vegas, New York City, & Amsterdam) (AS62710)
FreeBSD & Linux Server Management, Security Auditing, Server Optimization, PCI Compliance
-
04-12-2011, 07:50 PM #25
Junior Guru
- Join Date
- Feb 2007
- Location
- Chicago, IL
- Posts
- 205
Have you eliminated a SATA Port malfunction? It seems odd to have this much bad luck even with horrible drives.