Web Hosting Talk

View Full Version : uptime - downtime


Jason Ellis
07-04-2000, 12:06 AM
I don't know what the most common causes of downtime are for Unix hosts, but on NT (in my experience) I'd have to say the single most common cause of downtime is poorly written scripts overrunning the server. Whether those scripts are ASP (most common) or Perl (possible), NT's built-in safeguards will detect the overrun and, to protect the server (and prevent the entire operating system from crashing), NT will quickly and quietly shut off the web server software.

That is an effective safeguard - it knocks the offending script offline instantly and brings the CPU back under control. Unfortunately, it also has the effect of shutting down all the web sites on the server.

When this happens, a SysAdmin needs to log into the server and restart web services. If the host has someone on-duty 24/7, then the downtime probably won't be more than 5 minutes. On the other hand, if the host only has techs on during business hours, well... you can see the result.

However, I will say that even with this issue being fairly common, 99.9% uptime is *easily* sustainable if you have common sense, implement the proper monitoring systems, and have staff prepared to deal with such emergencies.
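For illustration, the kind of monitoring described above could be sketched in Python roughly as follows. This is only a minimal sketch, not what any particular host runs: the names `is_up` and `count_failed_checks` are made up for this example, and treating any HTTP status below 500 as "up" is an assumption.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=10.0):
    """Return True if the URL answers with an HTTP status below 500 (an assumption)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as err:
        # The web server answered, but with an error status (4xx or 5xx).
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


def count_failed_checks(check, attempts, interval_seconds, sleep=time.sleep):
    """Call check() `attempts` times, pausing between calls, and count the failures."""
    failures = 0
    for attempt in range(attempts):
        if not check():
            failures += 1
        if attempt < attempts - 1:
            sleep(interval_seconds)
    return failures
```

A real setup would page the on-duty tech on the first failure rather than just counting failures. The `check` argument is passed in so the loop can be tried out without touching the network.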

Jason

------------------
Jason Ellis, CEO
Hosting Solutions, Inc.
www.windowswebhost.com (http://www.windowswebhost.com)
Now offering Fully Managed Servers!

Jason_Berresford
07-04-2000, 12:09 AM
"There are 720 hours in a 30 day month (744 for a 31 day month). 99.9% uptime would still leave you with about 7 hours of downtime per month based on a 30 day month."

Just my experience, but 7 hours of downtime is a lot more than we experience. It's not that hard to have less than 1 hour of downtime a month.

The problem occurs when hosts don't have the equipment or staff to repair problems, large or small, as they come up. In my opinion the only type of downtime that should be experienced is downtime related to server upgrades, router work, etc.

This isn't a perfect world, but staying under 7 hours, or even 3 hours, of downtime a month is not all that hard.

------------------
[ Jason Berresford | Admin]
[ http://www.can-host.com ]
[ Admin@can-host.com ]
[ (905)765-8140 ]

Learner
07-04-2000, 09:08 AM
would someone please explain why hosting companies suffer from DOWNTIME ?

and how much UPTIME is to be practically expected from a good hosting company ?

i often read 99.9 % UPTIME... something tells me this is just BS !!! am i right ??!!!!

on the other hand, there are some companies who state 90% or 97% etc etc etc.

i wish to know what "downtime" is really due to.

Thomas Kangas
07-04-2000, 09:58 AM
>would someone please explain why hosting companies suffer from DOWNTIME ?

Downtime can be attributed to several factors including server crashes, router problems, or hardware/software glitches.

Another aspect of server downtime is planned maintenance. Normally, this is downtime that is announced, and is usually for the purpose of upgrading hardware, the operating system, or other vital software on the server (Windows NT 4 is notorious for reboots caused by software installs).


>and how much UPTIME is to be practically expected from a good hosting company ?
>
>i often read 99.9 % UPTIME... something tells me this is just BS !!! am i right ??!!!!

I disagree with this somewhat. 99.9% is quite within the realm of achievable. Consider this:

There are 720 hours in a 30 day month (744 for a 31 day month). 99.9% uptime would still leave you with about 7 hours of downtime per month based on a 30 day month.

I work primarily as a Unix/NT/WAN Sysadmin for a fairly large government entity. If my network were to suffer 7 hours of downtime in a month, it would be unheard of.

Conversely, I also resell NT hosting (I am most proficient with NT), and the company I resell for has suffered from what I can tell about 12 hours of downtime (for my site and all of my clients) in the last 2 years.
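As a rough check on that figure, observed downtime can be turned into an uptime percentage. The sketch below is only an illustration: the function name `uptime_percent` is made up for this example, and the 2-year period is assumed to be two 365-day years.

```python
def uptime_percent(downtime_hours, period_hours):
    """Return uptime as a percentage of the period, given the hours of downtime."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return 100 * (1 - downtime_hours / period_hours)


two_years_hours = 2 * 365 * 24  # 17,520 hours, ignoring leap days
print(f"{uptime_percent(12, two_years_hours):.2f}%")
```

Under those assumptions, 12 hours of downtime in 2 years works out to roughly 99.93% uptime.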

I have not seen too many hosts that can provide that type of stability (although I have been with them primarily the whole time I have had my site).

So yes, 99.9% is quite obtainable, but it requires the company to invest in good hardware, connectivity, and most importantly very competent sysadmins.

>on the other hand, there are some companies who state 90% or 97% etc etc etc.

For a personal site, 90% uptime is probably at the low end of acceptable (if acceptable at all). I could not fathom having 21 hours of downtime a month. Most E-commerce sites would lose a ton of income from that kind of average downtime IMO.

>i wish to know what "downtime" is really due to.

More often than not, it seems that downtime is related to hardware crashes or lost connectivity more than anything else, although I cannot vouch for the hosts out there. In the case I described above, the downtime for my site and the sites of my customers was caused by buried fiber-optic cables that got cut.

I am sure the hosts out here could probably give their take on what is the most common cause of downtime.

Hope this helps,

Sincerely,

Thomas A. Kangas

Dave
07-04-2000, 02:27 PM
Downtime can also have a lot to do with the route between you and the server. The best way to tell why you are down is to run a trace route while you are down. There have been many times that I have received a complaint from a client saying the server is down, but they are the only ones who cannot connect to it. It all has to do with routing: some routers are stupid and can take a while to find another route to the server. These, I have found, are the downtimes that only last 10 or 15 minutes. If the route between you and the server has been broken, it may take the routers some time to figure out the next best route to the server.

We are on the Alabanza network and have been for some time, and I have to say that they are up 99.9% of the time. So next time you go down, do not be in such a big hurry to assume the server is down; do a trace route and that will show where the route breaks off. I do know that a lot of servers go down a lot, but if you are on a reliable server, chances are it is a routing problem, not the server.
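The trace route check described above can be scripted. What follows is a rough sketch, not a finished tool: the helper names `trace_command`, `first_silent_hop` and `run_trace` are made up for this example, and it assumes the stock `tracert` (Windows) or `traceroute` (Unix) command is installed and marks a probe that got no answer with `*`.

```python
import platform
import re
import subprocess

# A hop line starts with the hop number; header and footer lines do not.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def trace_command(host):
    """Pick the trace route command for this OS, with name lookups switched off."""
    if platform.system() == "Windows":
        return ["tracert", "-d", host]
    return ["traceroute", "-n", host]


def first_silent_hop(output):
    """Return the number of the first hop where every probe timed out, or None."""
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue
        hop = int(match.group(1))
        # tracert appends "Request timed out." after the asterisks; drop it.
        tokens = match.group(2).replace("Request timed out.", "").split()
        if tokens and all(token == "*" for token in tokens):
            return hop
    return None


def run_trace(host):
    """Run the trace and return whatever the command printed."""
    result = subprocess.run(trace_command(host), capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    hop = first_silent_hop(run_trace("example.com"))
    print("every hop answered" if hop is None else f"first silent hop: {hop}")
```

A silent hop is only a hint. Some routers never answer trace probes even though they pass traffic along fine, so check whether the later hops respond before blaming that router.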

JRC Systems

Dave
07-04-2000, 02:36 PM
Here is a link to a free trace route program that is very good.
http://www.analogx.com/contents/download/network/htrace.htm


JRC Systems

Duster
07-04-2000, 03:15 PM
So true, Dave. The great majority of people on the Internet haven't got a clue as to how it works, so they assume that if they can't connect to a site, it must be the fault of the site. For that reason, I added a page called Internet Issues and FAQ to my site over a year ago, and now have it on my server site as well. It explains, in simple terms, some of the reasons there may be problems in connecting and how one can check.

It has cut down on questions about connectivity in my discussion forums and serves as a quick reference for the occasional few that pop up. It's easier to refer people to the page than to answer their questions with short answers.

The page I'm referring to is at http://techcellence.net/internetfaq.htm and it has a link at the bottom to a much more detailed coverage of the Internet and its origins.

Deb
07-04-2000, 04:52 PM
WOAH!!!! Careful all. There is a major flaw in the following math:
"There are 720 hours in a 30 day month (744 for a 31 day month). 99.9% uptime would still leave you with about 7 hours of downtime per month based on a 30 day month."
This is just incorrect...

Firstly, I personally believe it's better to go by the minute, since 15 minutes of downtime is just as serious as an hour of downtime. I'm often offended by hosts that calculate downtime by the hour simply because it shields them from counting 20 and 30 minute outages, as well as by hosts who won't calculate it at all unless they are 'forced' to by a site owner, but that's another issue...

At any rate...

Going off of a 30 day month there are 720 hours which is equal to 43,200 minutes. Now for the math:

43,200 * 0.999 (99.9%) = 43,156.8 minutes.
43,200 - 43,156.8 minutes = 43.2 minutes.
So this allows the host to have up to 43.2 minutes of downtime while remaining within 99.9% uptime. An obvious far cry from 7 hours!

The following shows each uptime (UT) level and the downtime (DT) it allows, in minutes (m), over a 30 day month.

99.9% UT = 43.2m DT (less than 1 hour)
99.8% UT = 86.4m DT (1.44 hours)
99.7% UT = 129.6m DT (2.16 hours)
99.6% UT = 172.8m DT (2.88 hours)
99.5% UT = 216m DT (3.6 hours)
99.4% UT = 259.2m DT (4.32 hours)
99.3% UT = 302.4m DT (5.04 hours)
99.2% UT = 345.6m DT (5.76 hours)
99.1% UT = 388.8m DT (6.48 hours)
99.0% UT = 432m DT (7.2 hours)

There is just way too huge of a difference... As you can see a 99.0% uptime guarantee would allow for the 7 hours but a 99.9% uptime guarantee certainly would not.
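For anyone who wants to redo the arithmetic for other percentages or month lengths, here is a small Python sketch. The function name `allowed_downtime_minutes` is made up for this example, and it assumes the allowance is measured over one calendar month of whole days.

```python
def allowed_downtime_minutes(uptime_percent, days=30):
    """Minutes of downtime permitted in a month of `days` days at the given uptime."""
    total_minutes = days * 24 * 60  # 30 days -> 43,200 minutes
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for uptime in (99.9, 99.5, 99.0, 98.0, 90.0):
        minutes = allowed_downtime_minutes(uptime)
        print(f"{uptime}% UT = {minutes:.1f}m DT ({minutes / 60:.2f} hours)")
```

Passing `days=31` gives the allowance for a 744 hour month instead.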

Deb

- Http://www.FutureQuest.net/

<EDIT> For a personal site, 90% uptime is probably at the low end of acceptable (if acceptable at all). I could not fathom having 21 hours of downtime a month.

98% UT = 14.4 hours allowed downtime
97% UT = 21.6 hours allowed downtime
96% UT = 28.8 hours allowed downtime
95% UT = 36 hours allowed downtime
94% UT = 43.2 hours allowed downtime
93% UT = 50.4 hours allowed downtime
92% UT = 57.6 hours allowed downtime
91% UT = 64.8 hours allowed downtime
90% UT = 72 hours allowed downtime

</EDIT>

A decimal is a terrible thing to waste.



[This message has been edited by Deb (edited 07-04-2000).]

Thomas Kangas
07-04-2000, 08:20 PM
DOH!!!


Good catch, Deb. I miscalculated. My apologies to the board.

Sincerely,

Thomas A. Kangas