Web Hosting Talk







JaguarPC - restoring a server for more than 24 hours?


Master Bo
12-19-2007, 11:08 AM
Greetings,

I have been a JaguarPC customer for more than 2 years now. Yesterday one of their servers, the one my reseller account is hosted on, went down - HDD failure.

OK, things happen, they started to restore data from a day-old backup. And here the mystery began. The restore procedure became very slow. Admins post the percentage - and it takes 3 hours to move from 98% to 99%.

Support doesn't give any clear answers, just 'wait, it will soon be over'-like replies.

I am astounded - I have never been treated that way, and I had thought JaguarPC was one of the best hosting providers I knew.

Has anyone here had a similar experience with JaguarPC? Are such events rare or more or less regular?

What I really hate is getting no information from them. I can't believe it could take that much time to restore those data unless more problems appeared.

By the way, are there any well-established hosting providers out there with good support, moderate prices and a long history of good operation?

Thank you.

Jag
12-19-2007, 11:22 AM
If you're on a huge array, then depending on the amount of data and level of compression, it can indeed take a while.

We're adding lots of backup space constantly and over the next many months will have more than tripled our space. That will allow us to use less compression which will help in restore times.

Master Bo
12-19-2007, 11:54 AM
The problem is I don't know whether I'm on a huge array. I am not given any clear replies.

I was told approx. 8 hours ago - "we expect the server to be up in 4-5 hours". Now that it's still down, I am only told to wait longer.

For how much longer? A day, a week? I will soon start losing customers, because I can only tell them "No information on ETA, we should wait". But JPC surely doesn't care about that.

A hosting provider is best tested by such hardware failures. Looks like JPC can't pass the test decently.

JaguarPC, as Support told me, is not responsible for such server downtime. I like that.

iHubNet-Matt
12-19-2007, 12:24 PM
This is a bad situation. I think it will take time to restore things completely, but I'm not sure whether it should take this much.
Maybe they have had some unforeseen issues, but an update would be worthwhile. Anyway, only JPC will be able to say more on this.

Master Bo
12-19-2007, 01:59 PM
But they won't offer any updates.

Downtime is 28 hours now. Support replies only "read the forum for updates", and the last update on the forum was 4 hours ago. The members who post to the forum ignore PMs.

And, as I mentioned, JaguarPC's ToS/SLA doesn't oblige them to compensate customers in such situations.

Draw your own conclusions.

iHubNet-Matt
12-19-2007, 03:24 PM
It must be really frustrating to have 28 hours of downtime. I hope someone from JPC comes with an update on why it is taking this long.

Master Bo
12-19-2007, 08:13 PM
The downtime is 34 hours and it looks like it will be measured in weeks. The last update said this:

"We apologize for the inconvenience that we were unable to give any update earlier. The data restore was completed but the server failed to boot up. We are able to see disk partitions, raid, and data all is safe. However the kernel panics on boot up. We are working as fast as possible to fix this new problem we are facing."

I wonder how long it would take to set up a new server and copy the intact data from the failed one (the data are intact, so there is no need to decompress them or anything, just copy)?

I am afraid this will take forever and, of course, there will be no updates on the process. Just no information until whatever outcome occurs.

I am starting to look for an alternative hosting provider. The irony is that I can't effectively migrate, since all my data are stuck at JaguarPC.

And I started to lose customers. JFYI.

IH-Rameen
12-19-2007, 08:26 PM
This is a bad situation. I think it will take time to restore things completely, but I'm not sure whether it should take this much.
Maybe they have had some unforeseen issues, but an update would be worthwhile. Anyway, only JPC will be able to say more on this.

Good job on another spam post :spamsign:

Back to the issue..

I think it's vital to take into account that you have been with this company for 2 years and I assume you have been satisfied to be with them that long.

Hosts are prone to problems and errors. Hard drives will fail, networks will fail - problems will happen. In such a situation you simply need to sit back and let JaguarPC do its job. Clearly they are working on it, and they have no reason to delay it on purpose.

The best thing you can do is wait patiently, even if it is going to take a long time. I know these situations are frustrating, but as a host, I know what it is like when problems arise, and people can work much more easily, efficiently and quickly when they don't have customers complaining etc.

The important part to observe here is to see how your host handles the situation.

orfeo
12-19-2007, 08:35 PM
Just another update, it is still very bad.

Unfortunately I have already lost two clients and I believe that tomorrow morning (in 7 hours) I will lose two or three more, once they get to their offices and realize that they have had no email for almost 48 hours.

I know they are doing the best they can, but communicating this to my clients is very difficult. Considering the server also had some downtime (a few hours) the previous month, this makes me look bad.

I went to reseller accounts just to avoid this kind of long downtime and here we go again... I always believed that in the worst-case scenario they could set up a new server from the backup in a few hours. Not 2 days.

IH-Rameen
12-19-2007, 10:52 PM
I went to reseller accounts just to avoid this kind of long downtime and here we go again..

A reseller account won't prevent this from happening. Neither would a VPS or dedicated. A hard drive failure can pretty much happen to any server. A reseller or VPS won't give you any more or less redundancy.. :agree:

Master Bo
12-19-2007, 11:18 PM
I think it's vital to take into account that you have been with this company for 2 years and I assume you have been satisfied to be with them that long.

Until this failure, there were no major disasters. Hosts are best tested by major disasters. JPC fails this test already.

Hosts are prone to problems and errors. Hard drives will fail, networks will fail - problems will happen. In such a situation you simply need to sit back and let JaguarPC do its job. Clearly they are working on it, and they have no reason to delay it on purpose.

Pray tell me how I can know they are indeed doing anything if time passes by and, apart from a scarce 'count up', I see no explanation?

The best thing you can do is wait patiently, even if it is going to take a long time. I know these situations are frustrating, but as a host, I know what it is like when problems arise, and people can work much more easily, efficiently and quickly when they don't have customers complaining etc.

Oh, thank you for the great advice. Remind me to give you similar advice when one of your hosts fails in the same manner.

According to what JPC said on their forum:

- their RAID on the Demetrius failed in such a manner that nothing could be recovered
- JPC techs installed new disk firmware before actually starting data recovery (?? why?? why introduce possible new problems before the data are back?)
- their backup software is both inefficient in terms of restoring 200 GB of data and obscure in terms of the current restore progress
- if this was not their first total data recovery experience (and it must NOT be the first), they must have an assessment of how long it will take. A rough estimate. But they kept saying 'in 4-5 hours' every time they were asked after the previously promised interval had passed.

Does the above look like a professional approach?

Is 38+ hours per server a normal time frame? I am a sysadmin myself and have experience with the backup/restore process.

The important part to observe here is to see how your host handles the situation.

I already see that. And I am sure they aren't able to handle such failures in a timely and decent manner.

PremiumHost
12-20-2007, 01:00 AM
50GB of cPanel full backup files will take at least 12 hours to complete the restore.

I don't think your website will be restored anytime soon if they don't have the backup data ready and still in recovery phase.

Master Bo
12-20-2007, 02:21 AM
50GB of cPanel full backup files will take at least 12 hours to complete the restore.

I don't think your website will be restored anytime soon if they don't have the backup data ready and still in recovery phase.

50 GB will take 12 hours or more on any server with any HDD type? Are you serious?

Such estimations should refer to exact server specs to be taken seriously.
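
To put the quoted figure in perspective, the average throughput it implies can be worked out directly. This is only a rough sketch: the function name is made up for illustration, and it assumes decimal units (1 GB = 1000 MB) and a steady transfer rate, which a real restore will not have.

```python
def implied_throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average throughput (MB/s) needed to move size_gb within the given hours."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    # Assumption: decimal units, 1 GB = 1000 MB.
    return size_gb * 1000 / (hours * 3600)


# The figures from the quoted estimate: 50 GB in 12 hours.
rate = implied_throughput_mb_s(50, 12)
print(f"{rate:.2f} MB/s")  # prints "1.16 MB/s"
```

That is a little over 1 MB/s averaged over the whole restore, which is why the estimate only means something next to the actual server and storage specs.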

The data are recovered (it took them more than 20 hours to complete that), now they are trying to make the server boot.

Citing their postings:

06:18 AM
It will be several more hours before the server is back online and when it is we will update with the relevant info.

11:52 AM
We're still working on this. A few more hours I think and we should have this back up.

No comments.

plumsauce
12-20-2007, 03:36 AM
50GB of cPanel full backup files will take at least 12 hours to complete the restore

That's about right.... for the 8 year old DAT3 drive I use to backup my desktop.

From my experience, restoring from backup is a load and wait operation. A long time ago, I used to start the restore, wait 5 minutes, use the 5 minute rate to figure out how long I could sleep, and let the operator take care of pushing the "next tape" button on the library at the appropriate intervals.

Surely the support tech could have posted a "X GB out of 50 GB restored" message every hour while waiting around.
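
The "measure for 5 minutes, then extrapolate" approach described above can be sketched in a few lines. This is a hypothetical illustration, not the tooling anyone in this thread actually used: the function names and the sample numbers (1 GB restored in the first 5 minutes of a 50 GB restore) are assumptions, and a constant restore rate is assumed as well.

```python
def estimate_remaining_seconds(done_gb: float, total_gb: float, elapsed_s: float) -> float:
    """Extrapolate the remaining time from the average rate observed so far."""
    if done_gb <= 0 or elapsed_s <= 0:
        raise ValueError("need some measured progress before extrapolating")
    rate_gb_per_s = done_gb / elapsed_s
    remaining_gb = max(total_gb - done_gb, 0.0)
    return remaining_gb / rate_gb_per_s


def status_line(done_gb: float, total_gb: float, elapsed_s: float) -> str:
    """Build the kind of hourly status message suggested above."""
    eta_h = estimate_remaining_seconds(done_gb, total_gb, elapsed_s) / 3600
    return f"{done_gb:g} GB out of {total_gb:g} GB restored, roughly {eta_h:.1f} h remaining"


# Hypothetical sample: 1 GB restored after the first 5 minutes (300 s) of a 50 GB restore.
print(status_line(1, 50, 300))
```

The estimate is only as good as the assumption that the rate stays constant, so it should be recomputed from fresh measurements each time a status update is posted.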

Master Bo
12-20-2007, 03:51 AM
plumsauce. Indeed. In case I didn't express that properly: what irritates me most in the current situation is the fact that I was given false information on when the issue would be resolved. Naturally, my customers became enraged when they realized the information had nothing to do with reality.

And yes, an actual progress indicator would be great to see. Alas, the software JPC uses to backup/restore isn't that smart and provides few useful pieces of information on the restore progress.

I only hope they (JPC) will change their approach to handling such situations. Hardware has its lifetime, it will fail sooner or later, but don't feed me false information and/or keep me completely uninformed.

Asher S
12-20-2007, 04:17 AM
Large storage arrays, which are becoming increasingly popular as hosting companies oversell, are indeed quite painful and slow to restore.

orfeo
12-20-2007, 06:36 AM
I mentioned reseller because when you are on a reseller account you know that your problem starts receiving attention minutes after it occurs. The backup is there, and there are a lot of customers to take care of. That's what I am saying and I don't think it is far from reality.

Restoring one server from a backup shouldn't take more than a full day. I am not referring to technical issues here. It SHOULDN'T take more. If it does, then the backup-restore procedure is not efficient from my point of view.

PremiumHost
12-20-2007, 08:32 AM
Restoring one server from a backup shouldn't take more than a full day. I am not referring to technical issues here. It SHOULDN'T take more. If it does, then the backup-restore procedure is not efficient from my point of view.

Depends on what kind of backup and how much data.
Restoring a disk image backup will not take more than a few hours.
Restoring cPanel backup files, which guarantees everything works after the restore, will take much longer.

Master Bo
12-20-2007, 08:42 AM
Depends on what kind of backup and how much data.
Restoring a disk image backup will not take more than a few hours.
Restoring cPanel backup files, which guarantees everything works after the restore, will take much longer.

If that means that the host will spend days or weeks restoring the data of one (1!) server from backup - then the whole backup/restore system is inefficient. Period.

Master Bo
12-20-2007, 08:52 AM
Downtime is 47 hours.

The last update from admins is 7 hours old and it states:

We're still working on this. A few more hours I think and we should have this back up.

DeNasio
12-20-2007, 09:14 AM
Downtime is 47 hours.

The last update from admins is 7 hours old and it states:

We're still working on this. A few more hours I think and we should have this back up.

Nothing yet?

Master Bo
12-20-2007, 10:08 AM
Downtime 48 hours.

No update from admins. The server (demetrius.nocdirect.com) is pingable and it even responds to HTTP, displaying the default WHM/cPanel Web page.

No sites, however.

Master Bo
12-20-2007, 10:47 AM
A new update:

08:31 PM
Demetrius is up and most of you should already be seeing your websites up and running. However, some of the copying of data from the old disk is still in progress. Due to this, there will be some load on the server during this restoration process.
We will update you again once we have everything running perfect and back to normal.

I don't know about most, but my sites are all broken. I wonder how much time it will take to "have everything running perfect".

IH-Rameen
12-20-2007, 10:47 AM
Pray tell me how I can know they are indeed doing anything if time passes by and, apart from a scarce 'count up', I see no explanation?


Well, what else will they be doing? Sitting at their desks laughing at everyone's misfortune?


Oh, thank you for the great advice. Remind me to give you similar advice when one of your hosts fails in the same manner.


We don't host through someone else. We have our own servers and equipment. We have indeed been through hard drive failures and disasters. But we use low capacity drives so the restore is pretty quick.


As for receiving updates.. The problem can be such that a simple ETA cannot be provided. Or simply there is not much to update on..

I would agree that more frequent updates would be ideal, but it's not always possible.

Master Bo
12-20-2007, 11:15 AM
Well, what else will they be doing? Sitting at their desks laughing at everyone's misfortune?

Your imagination is richer than mine. The sitting-and-laughing theory hadn't crossed my mind.

To be serious: when they provide contradictory reports, it's hard to understand what's happening.

We don't host through someone else. We have our own servers and equipment. We have indeed been through hard drive failures and disasters. But we use low capacity drives so the restore is pretty quick.

This is the point: low capacity drives. One of JPC's customers recalled that there was a similar crash 1-2 years ago and that JPC promised to take measures so that such long downtimes would never happen again.

Looks like they didn't take measures since that incident.

As for receiving updates.. The problem can be such that a simple ETA cannot be provided. Or simply there is not much to update on..

I would agree that more frequent updates would be ideal, but it's not always possible.

Yes, exact ETAs aren't always available. But what about rough ones? In 2 days, in 3 days? I could have installed my customers' sites on another reseller account if I had known the downtime would be that long. Instead, we were being told that "in a few more hours everything will be back".

Jame$
12-20-2007, 11:22 AM
I'm sure they will fix it all up in the end. The only thing I can really criticize Jag for in this case is the constant stalling you got, "a little more, a little more", when really it took days. That is kind of dishonest, poor support.

Master Bo
12-20-2007, 11:33 AM
The only thing I can really criticize Jag for in this case is the constant stalling you got, "a little more, a little more" when really it took days.

You are absolutely right. This is the only thing that drove me mad. Nothing else.

OK, I will post the final update when all my sites are working.

ldcdc
12-20-2007, 04:29 PM
This is the point: low capacity drives.

Well, there tends to be a balance. If you want lots of space for relatively few $$, the host will have to use larger drives, which, in case of disaster, can mean longer (and sometimes significantly longer) restore times.

That being said, the trade-off is not necessarily linear.

Master Bo
12-20-2007, 07:13 PM
8 hours after the last admins' update. My sites are still broken.

Total downtime: 56 hours.

DeNasio
12-20-2007, 07:41 PM
Total downtime: 56 hours.

This is not good for business. But I'm sure JPC are doing their best.

Master Bo
12-20-2007, 09:32 PM
My sites are appearing online. Looks like for me the downtime's over.

Total downtime was 58 hours.

orfeo
12-21-2007, 06:40 AM
I also have all my sites online. It was a very bad 3 days.