Web Hosting Talk

Data loss nightmare with RackShack.net


pmak0
01-20-2002, 01:09 PM
At about 1am, my machine had a very high load average (over 100!) for some reason. I suspect that either Apache or AdCycle chewed up all the available memory (the machine gets over 250,000 hits per day), although I never managed to get a "top" display to check.

After 15 minutes of trying to "su" (I got as far as the "Password:" prompt but it just sat there), I asked RackShack for a reboot:

Problem Description:
1/20/02 1:26:50 AM
1:39am up 20 days, 15:29, 14 users, load average: 168.51, 139.18, 82.77
Machine is frozen. Please reboot.

I quickly got back a response that the machine was rebooted.

Resolution Description:
1/20/02 1:30:34 AM
rebooted.

However, the machine did not come back up when it was rebooted. Three hours later, I finally got this response:

1/20/02 4:27:16 AM
server has some strange behavior. customer has installed the LILO loader and what looks like a recompiled kernel. It does not seem to finish booting and it returned some I/O HD errors the last boot. Possible bad drive, definite restore.
1/20/02 4:31:03 AM
We will need authorization to do a restore and replace the drive. Let us know immediately whether you wish to procede with this. We will not be responsible for any restoration of content.

The part about the kernel seems to be irrelevant (BTW it was a reconfigured kernel, not a recompiled one) since the machine had been running fine since the beginning of the month.

I'm thinking that with a load average of over 100, there were many open files that were never properly closed, and that rebooting the machine in that state corrupted the disk.

So now, I have to wait for them to restore the server. I can't even get my e-mail right now since it was on that server. I have a 20 hour old backup of the server, but it only has data so I have to recompile all the programs, which will take me a whole day. :(

Anyway, the reason I posted about this was to ask you guys: Do you think that RackShack was at fault for this (for purchasing bad hardware? for not being able to recover the hard drive?)?

Ales
01-20-2002, 01:58 PM
Sorry to hear about your misfortune, but IMHO RackShack is not to be blamed here... What for?

Anyway, I hope you get your server back in shape as soon as possible.

Ales

pmak0
01-20-2002, 04:29 PM
Man, this is kind of funny in a morbid sort of way:

1/20/02 12:08:54 PM
Pulling server for restore
1/20/02 12:16:09 PM
This is not the first restore on this server please bill customer and return to Data Center.

davidb
01-20-2002, 04:38 PM
RackShack is not to blame here. Very sad what happened, but it does happen. Hardware fails all the time, and this time it just happened to you. Right now I would recommend that you upgrade and have the drives mirrored, which will hopefully save you from having to do a restore next time :(

Pilgrim
01-20-2002, 05:43 PM
Originally posted by pmak0
please bill customer and return to Data Center.

Well, since it isn't their fault and it is your server, you will have to pay them for their time.

That doesn't make it any sweeter though. I suspect you had other plans for your money :(

It's life. We all have to live through it ;)

NyteOwl
01-20-2002, 06:15 PM
Those are some load averages. Looks like a runaway process (or two).
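
For anyone who wants to catch this kind of thing before the box stops responding, here is a rough sketch of reading the three numbers out of an uptime line like the one in the ticket. The function names and the per-CPU limit of 4.0 are just assumptions for illustration, not anything RackShack or pmak0 actually used, and the right threshold depends on the machine.

```python
import re

# Hypothetical limit: 1-minute load per CPU above which we call the box overloaded.
OVERLOAD_PER_CPU = 4.0


def parse_load_averages(uptime_line):
    """Return the (1-min, 5-min, 15-min) load averages found in an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())


def is_overloaded(uptime_line, cpu_count=1, per_cpu_limit=OVERLOAD_PER_CPU):
    """True when the 1-minute load per CPU exceeds the (assumed) limit."""
    one_minute = parse_load_averages(uptime_line)[0]
    return one_minute / cpu_count > per_cpu_limit


def is_climbing(uptime_line):
    """True when 1-min > 5-min > 15-min, i.e. the load is still rising."""
    one, five, fifteen = parse_load_averages(uptime_line)
    return one > five > fifteen


if __name__ == "__main__":
    # The uptime line quoted in the original reboot request.
    line = "1:39am up 20 days, 15:29, 14 users, load average: 168.51, 139.18, 82.77"
    print(parse_load_averages(line), is_overloaded(line), is_climbing(line))
```

Run from cron against the output of uptime, something like this could log or mail a warning while the load is still climbing, which is about the only point at which you can still get a "top" display to see what is eating the machine.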