Web Hosting Talk

What are good LOAD AVERAGES for a host?


chrisb
09-17-2002, 11:21 PM
Using the "uptime" command, my host shows load averages of 4.65, 4.55, and 4.95.

Is that...

1. excellent
2. above normal
3. normal
4. below normal
5. lousy
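
For reference, the three numbers that uptime prints are the 1-, 5-, and 15-minute load averages. A minimal sketch for pulling them out of an uptime line in a script, assuming the usual "load average: a, b, c" ending; the function name load_averages is made up for illustration:

```shell
# Hypothetical helper: print the 1-, 5-, and 15-minute load averages
# from a line of `uptime` output read on stdin.
load_averages() {
    # Drop everything up to "load average:" (or "load averages:"),
    # then remove the commas between the three numbers.
    sed -e 's/.*load averages\{0,1\}: *//' -e 's/,//g'
}

# Usage: uptime | load_averages
```

On Linux the same three figures can also be read directly from /proc/loadavg.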

johnallen
09-17-2002, 11:22 PM
Below 1.0

[edit] Also run top and see which process is using so many resources.

ChickenSteak
09-17-2002, 11:25 PM
Our servers for managed shared solutions stay around this load average: 0.04, 0.30, 0.27. We use P4 2GHz servers with 1GB RAM. So yes, I would say below one is what it should be to be considered the best.

RackNine
09-17-2002, 11:25 PM
Depends on the server. You probably want a load average below 2.5, but 4.5 wouldn't be all that bad if the server had multiple processors and a bunch of RAM.

Sincerely,

-Matt

Akash
09-17-2002, 11:27 PM
Originally posted by RackNine
Depends on the server. You probably want a load average below 2.5, but 4.5 wouldn't be all that bad if the server had multiple processors and a bunch of RAM.

Sincerely,

-Matt

Agreed. I've even seen servers (namely at another very popular host) with averages between 6 and 10 that were still able to handle it.

chrisb
09-17-2002, 11:28 PM
Originally posted by johnallen
Below 1.0

[edit] Also run top and see which process is using so many resources.

So, I take it that load averages around 5 are lousy. I thought they should be less than 1, but didn't remember.

What causes them to be so high? Too many users on a server?

I'm off to run the top command now to see if I can see what's happening.

phpcoder
09-17-2002, 11:31 PM
chrisb, it depends on the server specs.... do you have them?

johnallen
09-17-2002, 11:34 PM
Yeah, top will show which processes are running and how much CPU and memory each one is using.

Aussie Bob
09-17-2002, 11:46 PM
5. lousy :eek:

Aussie Bob
09-17-2002, 11:47 PM
Originally posted by chrisb


So, I take it that load averages around 5 are lousy. I thought they should be less than 1, but didn't remember.

What causes them to be so high? Too many users on a server?

I'm off to run the top command now to see if I can see what's happening.
Run top as root and then "k" the hogging processes. ;) :D

chrisb
09-17-2002, 11:50 PM
I can't get the top command to work, but here are the server specs.

Server Information

Processor Info
Processor #1 Vendor: GenuineIntel
Processor #1 Name: Intel(R) Pentium(R) III CPU family 1266MHz
Processor #1 speed: 1266.098 MHz
Processor #1 cache size: 512 KB

Processor #2 Vendor: GenuineIntel
Processor #2 Name: Intel(R) Pentium(R) III CPU family 1266MHz
Processor #2 speed: 1266.098 MHz
Processor #2 cache size: 512 KB

Memory Information
Memory: 3996600k/4063168k available (1284k kernel code, 66180k reserved, 376k data, 216k init, 3145664k highmem)

System Information
Linux jupiter.net 2.4.18 #1 SMP Tue Jun 25 10:49:31 MDT 2002 i686 unknown

Physical Drives
hdc: CDU5211, ATAPI CD/DVD-ROM drive
hdc: ATAPI 52X CD-ROM drive, 120kB Cache, UDMA(33)
Current Memory Usage
             total       used       free     shared    buffers     cached
Mem:       3997096    3809452     187644          0     254388    2573912
-/+ buffers/cache:     981152    3015944
Swap:      1052216     130816     921400
Total:     5049312    3940268    1109044
Current Disk Usage
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5             2.0G  197M  1.6G  11% /
/dev/sda1              38M  9.2M   26M  26% /boot
/dev/sda9             306G   11G  280G   4% /home
none                  1.9G     0  1.9G   0% /dev/shm
/dev/sda7              29G  9.1G   18G  33% /usr
/dev/sda6              29G  2.7G   24G  10% /var

insiderhosting
09-18-2002, 12:27 AM
Chris,
That is a dual-processor server, so a load of 5, while not great, isn't really that bad, though it could be slightly better.

Use this command to see how many sites are on the box: ls /home | wc -l

-Steven
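
On the dual-processor point: a common rule of thumb (not a hard limit) is to divide the load average by the number of CPUs before judging it. A rough sketch, where per_cpu_load is a hypothetical helper name:

```shell
# Hypothetical helper: divide a load average by the number of CPUs.
# Usage: per_cpu_load LOAD NCPUS
per_cpu_load() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Example: per_cpu_load 5 2   (a load of 5 on a dual-processor box)
```

A per-CPU figure well above 1 suggests work is queuing, but as the rest of the thread shows, the number alone does not say what is causing it.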

Aussie Bob
09-18-2002, 12:47 AM
Originally posted by insiderhosting
Use this command to see how many sites are on the box

Ummmmmm -
/dev/sda9 306G 11G 280G 4% /home
And that has an exact effect on the load of the server, how??

Oh no, not this argument again. ;) :rolleyes:

UmBillyCord
09-18-2002, 12:50 AM
When was this taken? If this was at 2 AM on a Saturday morning PT, then I would say this isn't the best. If this was taken at 2 PM on a Monday, then this would be fine. This server is pretty beefy in terms of RAM and CPU.

RackNine
09-18-2002, 12:53 AM
Originally posted by chrisb
I can't get the top command to work, but here are the server specs.
Dual 1.2GHz with 4GB RAM? Unless you're running CPU-intensive scripts, you likely won't notice the load.

-Matt

chrisb
09-18-2002, 12:59 AM
ls /home | wc -l returned nothing, but
ls -l /home | wc OR
ls /home -l | wc
RETURNED
441 3964 28369

AND
ls /home | wc
RETURNED
440 440 3699

So, does that mean there are 440 sites on the same server?

This was taken at approx 4 AM CST on a Tuesday morning. I have noticed cgi-scripts executing a little slowly. Now the load averages are around 3 to 3-1/2.
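
On reading that output: wc with no options prints three counts (lines, words, bytes), so 440 is the number of entries under /home. The ls -l variant reports 441 because ls -l prints an extra "total" line first. A small sketch that returns just the entry count; count_entries is a made-up name, and an entry is a directory or file, not necessarily a hosting account:

```shell
# Hypothetical helper: count the entries directly under a directory
# (default /home), one per line, ignoring hidden files.
count_entries() {
    # tr strips the padding some wc implementations add around the number.
    ls -1 "${1:-/home}" | wc -l | tr -d ' '
}

# Usage: count_entries /home
```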

insiderhosting
09-18-2002, 01:02 AM
Originally posted by Aussie Bob


Ummmmmm -

And that has an exact effect on the load of the server, how??

Oh no, not this argument again. ;) :rolleyes:

Hey bob,
Not sure I understand your question. The more sites on the server, the more load they will cause, especially those that use CGI scripts or other resource-intensive scripts.

-Steven

AussieHosts
09-18-2002, 01:13 AM
Originally posted by Aussie Bob


Ummmmmm -

And that has an exact effect on the load of the server, how??

Oh no, not this argument again. ;) :rolleyes:

You still don't believe that the chances of 1 in 500 sites overloading a server are greater than 1 in 250 do you Bob? ;)

Gary

UmBillyCord
09-18-2002, 01:18 AM
Originally posted by Editor


You still don't believe that the chances of 1 in 500 sites overloading a server are greater than 1 in 250 do you Bob? ;)

Gary

Even though I am American, I have traveled and worked throughout Australia. So when someone here stated in another thread Bob was a Queenslander, it really explained a lot. :D

Webdude
09-18-2002, 01:20 AM
Believe it or not, I have seen load averages hitting short bursts up to 300% before everything started crashing. I use certain scripts that are meant to crash my servers during testing prior to use. Generally, a server using top of the line hardware and good software can do a sustained 85% for quite a while.

Before anyone says a server can't hit a load of 300%, I can give a 4-line script that can prove you wrong. I would also say you're probably not a very good admin if you haven't explored the ways people can take your machine down so that you know how to stop them. I'm sure I'm not the only one here who knows such code. The others here who know it know I am correct.

Load average will depend on various things such as time of day, etc. There are also wannabe programmers who shouldn't be writing scripts, who can cause high server loads with amazingly simple scripts. You'd be surprised how many of them don't know what a die statement is. Poorly written scripts can continue to loop, fork, "add other terms here", etc., until they use all the RAM or too much CPU for too long. It's not unusual to log in and see a load average of 20%, but usually it shouldn't be above 5 for a dual CPU. Anything sustained above that, and the server is basically running on adrenaline and won't be able to hold it. Like anything, the higher the number, the shorter the time it can be sustained. You could basically compare it to the RPMs of a vehicle engine. There's probably a much better analogy, but that's the best layman's analogy I could think of right now. LOL

AussieHosts
09-18-2002, 01:26 AM
Yes, we would mostly all have process and load monitors/watchers in place, to detect/kill/alert what's happening where...but in the same way as one domain/script can bring a server to its knees, one of the first considerations must be how many domains/subdomains are going on to a server.

Cheers

Gary

Webdude
09-18-2002, 01:31 AM
At the same time, you can have only 5 domains hosted and have one bring it to its knees. Then again, you could have 500 and never have a problem.

chrisb
09-18-2002, 01:32 AM
Webdude, what does the time of day have to do with load averages? It seems that no matter what time of day it is, there is something causing it.

Webdude
09-18-2002, 01:37 AM
Time of day means that 2PM will have more people accessing the server and users' scripts than 2AM.

(Usually) :D That depends on where the primary traffic for that server is coming from. If its primary traffic is from the U.S., then anywhere between noon and 8PM (Pacific to Eastern time) could be that machine's highest load time. But if the load is high "ALL" the time, then there's a problem. That would mean there is at least one script that's gone haywire. Running top is the best way to find that, but I believe you said you couldn't run top, right?

AceWeb
09-18-2002, 01:40 AM
I would not host with someone if their constant server load is more than 1.0. From time to time it spikes, and that is fine, but normally it should not be more than 1.0.

chrisb
09-18-2002, 01:42 AM
What I meant was that the time of day is irrelevant to the cause of the load average. Does that make sense?

Anyhow, can someone please look at my ls -l /home | wc results and tell me how many users are on the server with me? Thanks.

AussieBob: Thanks for the info. Just curious, what does your load avg run?

Webdude
09-18-2002, 01:44 AM
ls -l /home doesn't mean anything. That just shows how many directories are in there, not actual accounts. If you can run that command, then you should be able to

ls -al /home (which means a poorly secured server if you can)

chrisb
09-18-2002, 01:51 AM
Originally posted by Webdude

ls -al /home (which means a poorly secured server if you can)

Let's not go there again. :)

Webdude
09-18-2002, 01:54 AM
Originally posted by chrisb


Let's not go there again. :)

LOL! Well in this case, it's good for your situation :stickout

netacore
09-18-2002, 01:58 AM
You are on cPanel I presume ...

Within WHM, what does 'Cpu/Memory/Mysql Usage History' under Server Status yield? -- who/what are the big consumers?


Originally posted by chrisb
I can't get the top command to work, but here are the server specs.

Server Information

Processor Info
Processor #1 Vendor: GenuineIntel
Processor #1 Name: Intel(R) Pentium(R) III CPU family 1266MHz
Processor #1 speed: 1266.098 MHz
Processor #1 cache size: 512 KB

Processor #2 Vendor: GenuineIntel
Processor #2 Name: Intel(R) Pentium(R) III CPU family 1266MHz
Processor #2 speed: 1266.098 MHz
Processor #2 cache size: 512 KB

faculty
09-18-2002, 02:10 AM
Yes it does.. and then you can see what scripts take the most resource.

What I do when this is a problem, is ask the client to "cut up" his/her scripts into more, smaller ones.


This brings down server load VERY much, and in your Terms and Conditions you should state that an account hogging resources and ruining the service for others can be removed or suspended until the hogging scripts are "cut up".

Works well :)

Jedito
09-18-2002, 02:10 AM
If it was taken on a server running WHM/cPanel near 1 AM, the load could be caused by the backup script.

BTW, that load is not bad for a dual processor with 4 GB RAM.

chrisb
09-18-2002, 02:13 AM
The load average now at 1:15 AM is up to around 6.5.

faculty
09-18-2002, 02:14 AM
Could be backup scripts and whatever.. they always bog down a server for a while :angry:

Jedito
09-18-2002, 02:19 AM
The backup script tends to raise the load.

If you can run ps aux, you'll see some processes like this:

root 27801 4.0 0.0 1584 712 ? S 00:20 0:11 tar zcf
root 27802 89.8 0.0 1716 664 ? R 00:20 5:19 gzip
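
To pick lines like these out of a long ps aux listing, one option is to sort on the %CPU column. A minimal sketch, assuming the standard ps aux column order with %CPU as the third field; top_cpu is a hypothetical name:

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the N
# processes with the highest %CPU (column 3), busiest first.
top_cpu() {
    # Drop the header line, sort numerically (descending) on column 3,
    # and keep the first N lines (default 5).
    grep -v '^USER' | sort -rn -k 3,3 | head -n "${1:-5}"
}

# Usage: ps aux | top_cpu 10
```

Note that ps reports CPU use averaged over each process's lifetime, so a short burst may not stand out the way it does in top.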

chrisb
09-18-2002, 02:20 AM
So, I have 440 users on the same server with me, and a load average of around 6.5. I think it does affect me because my scripts run slow, and I mostly work on my site in the wee hours of the morning.

However, I don't think I have anything running in the background, and no foreground scripts are running. But, how do I know if it's me or another user causing it?

Is it something I should ask my host to fix? I doubt if they will oblige, though.

mdrussell
09-18-2002, 02:42 AM
Even though it's a powerful box, it's probably a little overloaded. HTTP requests will probably still be pretty quick, but as you mentioned with your scripts, you might find a lag as they wait for CPU time.

A load average of 4-5 at peak times would be OK, but a sustained load that rises even higher is not good.

I'd speak to your host and mention that you think the server may be overloaded.

chrisb
09-18-2002, 02:49 AM
Thanks Jedito, I checked my process with the ps -u command and I'm using 0% memory, just as I thought. What specifically should I look for when I do ps aux? That returns a bunch of info.

AussieHosts
09-18-2002, 02:55 AM
Originally posted by chrisb
Is it something I should ask my host to fix? I doubt if they will oblige, though.

They may, but the situation can get out of hand. Drives fill up with unlimited domains/subdomains, loads shoot up, and then a host is faced with trying to shuffle sites out amongst servers to find a happy medium. Clawing back a few percent of space across different servers.

Good luck with it though.

Cheers

Gary

Paul L.
09-18-2002, 02:56 AM
I think everybody is missing a big point, and that is how much idle CPU % is available. The load figure on its own means nothing: servers with this load can run fine until the idle CPU is all used up, and then you have problems.

I have seen many servers with loads higher than this that have 40% to 50% of each CPU idle.

Also, many things can account for higher loads, such as backups running. And if the server is still taking on accounts, there is a lot more activity going on, such as FTP and so on, than on a server that is no longer taking on accounts.

baileysemt123
09-18-2002, 03:03 AM
chrisb,

A load number is only a part of the puzzle and is rather arbitrary.

As an example, here is a recent "top" off a similarly configured server:

11:28pm up 12 days, 3:39, 3 users, load average: 4.11, 3.91, 3.79
383 processes: 356 sleeping, 4 running, 18 zombie, 5 stopped
CPU0 states: 16.1% user, 21.1% system, 0.0% nice, 62.2% idle
CPU1 states: 13.3% user, 17.3% system, 0.0% nice, 68.4% idle

You can see, the load is between 3.79 and 4.11, and yet the CPU's are 62-68% idle. This was at 11:28 p.m. Mountain time, so 12:28 a.m. central time.

It's idle time that is a much truer measure of "load," not the load number that the server spits out. This is especially the case with a high-powered server, like the one you describe, as well.
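
A rough way to get one idle figure out of output like the above, assuming the old top layout with one "CPUn states:" line per processor (newer versions of top format this line differently); avg_idle is a made-up name:

```shell
# Hypothetical helper: read old-style top output on stdin and print the
# mean idle percentage across all "CPUn states:" lines.
avg_idle() {
    awk '/CPU[0-9]+ states:/ {
             # The idle figure is the field just before the word "idle".
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^idle/) { sub(/%/, "", $(i - 1)); sum += $(i - 1); n++ }
         }
         END { if (n) printf "%.1f\n", sum / n }'
}

# Usage: top -b -n 1 | avg_idle   (batch mode; only matches the old layout)
```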

I know that similar servers operate with no performance degradation at loads anywhere from 2-5. This particular server is operating at peak right now with no issues.

chrisb, any updates? what's your server at now? Enough time has passed, I would think that most back-ups would be done, but yes, I have worked on a # of servers where backups temporarily drove the load up to 8 or 10, especially if a few big ones are scheduled all at once. They dropped within a few minutes. No harm done. :)

:) Anyways just a couple thoughts.


:D Bailey

*edited for grammar & stuff

Aussie Bob
09-18-2002, 03:47 AM
Originally posted by UmBillyCord
Even though I am American, I have traveled and worked throughout Australia. So when someone here stated in another thread Bob was a Queenslander, it really explained a lot. :D
:stickout And next time you're down this way UBC, stop in for a drink. :D I'll even shout!! :)

This is not the thread for "the more domains a server has, then the more load the server will experience". I have a friend who has 1 domain on his server and he's flat out keeping loads under 4. Very heavy CGI etc. :stickout

Aussie Bob
09-18-2002, 03:51 AM
Originally posted by Editor
You still don't believe that the chances of 1 in 500 sites overloading a server are greater than 1 in 250 do you Bob? ;)

Gary
Depends what those 250 sites are doing? You can't generalise.

Ohhhh, so now you say "chances". lol :stickout

Just rounding off the odds there hey mate. ;)

AussieHosts
09-18-2002, 03:53 AM
Originally posted by Aussie Bob
This is not the thread for "the more domains a server has, then the more load the server will experience". I have a friend who has 1 domain on his server and he's flat out keeping loads under 4. Very heavy CGI etc. :stickout

That's a different situation again though Bob. The number of domains on a box will directly increase the chances of higher loads/poorer performance in a "shared" environment. Unless you plan on putting a $10/mth client on a server and if need be, only limiting it to him if he needs all the resources available. :)

Cheers

Gary

AussieHosts
09-18-2002, 03:57 AM
Originally posted by Aussie Bob
Depends what those 250 sites are doing? You can't generalise.

Likewise, you can't just say that one site might cripple a box, so inevitably the total number is irrelevant. It defies logic, mate.

Cheers

Gary

Aussie Bob
09-18-2002, 03:57 AM
Originally posted by Editor
Yes, we would mostly all have process and load monitors/watchers in place, to detect/kill/alert what's happening where...
Are you alerted by high loads? Do you go in manually and kill the processes??
but in the same way as one domain/script can bring a server to its knees, one of the first considerations must be how many domains/subdomains are going on to a server.
Also maybe what you have promised those sites can use as far as resources too. :D Another discussion for another day. ;)

AussieHosts
09-18-2002, 04:04 AM
Originally posted by Aussie Bob
Are you alerted by high loads? Do you go in manually and kill the processes??

Yes at 2.0 across the board. It's rare that we hit it.

Some processes will be killed automatically depending on the circumstances.

Cheers

Gary
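
A threshold check like the one described can be sketched in a few lines. This is only an illustration, not the monitor Gary runs: load_at_least is a hypothetical name, 2.0 is the threshold from his post, and the input is assumed to be a line in Linux /proc/loadavg format.

```shell
# Hypothetical check: succeed (exit 0) when the 1-minute load average in
# a /proc/loadavg-style line is at or above the given threshold.
# Usage: load_at_least THRESHOLD "LOADAVG_LINE"
load_at_least() {
    # "+ 0" forces a numeric comparison rather than a string comparison.
    echo "$2" | awk -v limit="$1" '{ exit !($1 + 0 >= limit + 0) }'
}

# Example (Linux): load_at_least 2.0 "$(cat /proc/loadavg)" && echo "load alert"
```

Run from cron every minute or so, the echo could be swapped for a mail command or whatever alerting the host already uses.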

Servstra-Sales
09-18-2002, 04:32 AM
We have a similar script we have installed on our server which we had a programmer whip up for us. It works like a treat.

chrisb
09-18-2002, 04:48 AM
Originally posted by baileysemt123
chrisb,

A load number is only a part of the puzzle and is rather arbitrary.

As an example, here is a recent "top" off a similarly configured server:

11:28pm up 12 days, 3:39, 3 users, load average: 4.11, 3.91, 3.79
383 processes: 356 sleeping, 4 running, 18 zombie, 5 stopped
CPU0 states: 16.1% user, 21.1% system, 0.0% nice, 62.2% idle
CPU1 states: 13.3% user, 17.3% system, 0.0% nice, 68.4% idle

You can see, the load is between 3.79 and 4.11, and yet the CPU's are 62-68% idle. This was at 11:28 p.m. Mountain time, so 12:28 a.m. central time.

It's idle time that is a much truer measure of "load," not the load number that the server spits out. This is especially the case with a high-powered server like Joust, and like the one you describe, as well.

I know that similar servers operate with no performance degradation at loads anywhere from 2-5. This particular server is operating at peak right now with no issues.

chrisb, any updates? what's your server at now? Enough time has passed, I would think that most back-ups would be done, but yes, I have worked on a # of servers where backups temporarily drove the load up to 8 or 10, especially if a few big ones are scheduled all at once. They dropped within a few minutes. No harm done. :)

:) Anyways just a couple thoughts.


:D Bailey

*edited for grammar & stuff

Thanks for all of that info, Miss Bailey. Why are 18 zombies running, though?

Anyhow, the last uptime I did showed
2:10 am MT load average: 6.73 5.03 4.42

Speaking of load averages, I wonder what WHT's are. I cannot connect much of the time. I wonder how many connections at a time WHT can handle?

baileysemt123
09-18-2002, 05:15 AM
chrisb> how is it now, if it had a midnight or a 1 a.m. back-up running, that should probably be done now?

Here's an example top off the same server I posted about earlier, now that back-ups are done.

3:06am up 12 days, 7:17, 3 users, load average: 0.87, 1.63, 2.14
376 processes: 351 sleeping, 1 running, 19 zombie, 5 stopped
CPU0 states: 20.1% user, 8.2% system, 0.0% nice, 71.0% idle
CPU1 states: 12.3% user, 7.3% system, 0.0% nice, 79.3% idle
Mem: 3997096K av, 3970452K used, 26644K free, 0K shrd, 142824K buff
Swap: 1052216K av, 242820K used, 809396K free 2679352K cached



3:18am up 12 days, 7:29, 3 users, load average: 0.73, 1.62, 2.00
370 processes: 340 sleeping, 4 running, 22 zombie, 4 stopped

CPU0 states: 23.2% user, 28.2% system, 1.1% nice, 47.4% idle
CPU1 states: 21.2% user, 28.3% system, 0.5% nice, 50.0% idle
Mem: 3997096K av, 3979252K used, 17844K free, 0K shrd, 149724K buff
Swap: 1052216K av, 242564K used, 809652K free 3007940K cached

Anyway so now out of curiosity I have been watching the server for the last 2 hours and I have found that with backups off, the server is less "sensitive" to user activity -- and the peaks come down much more quickly than they did with backups running as well. There will be peaks, that is normal... I have watched a server go from <1 to 5+, and back down to 1.5, in literally 30 seconds. And it kept right humming. It is a constant state of flux and at every moment, there are different operations kicking in and kicking out, both for users but also as part of the server's normal operations.

I should have posted earlier, I checked on my primary box and backups were taking 10-35% of CPU time, but I was called away and lost the cut & paste. :)

*yawn* I gotta go, boys... it is soooooo past this granny's bedtime. :)

:D Bailey

baileysemt123
09-18-2002, 05:20 AM
Speaking of load averages, I wonder what WHT's are. I cannot connect much of the time. I wonder how many connections at a time WHT can handle?

What a thought, indeed! :D

Aussie Bob
09-18-2002, 05:29 AM
Originally posted by chrisb
Speaking of load averages, I wonder what WHT's are. I cannot connect much of the time. I wonder how many connections at a time WHT can handle?
Yeah, good question. :D

headsurfer, chicken - can someone facilitate this request?? :)

Although isn't WHT running across a cluster or something...?

chrisb
09-18-2002, 05:49 AM
at 4:34 CST, the load average from 3 checks is 1.8 to 2.2

Good nite, Granny. :)

I'm about to hit the sack too within the next hour.

It took me 25 minutes to connect to WHT. Cuss, cuss, cuss! :)

chrisb
09-18-2002, 05:56 AM
BTW, WHT's time is about 16 minutes slow.

Paul L.
09-18-2002, 06:11 AM
Chris a zombie process is already dead; you cannot kill it more :-)
A zombie goes away once its parent process reaps it (calls wait), or when the parent exits and init reaps it, so a reboot should not be needed.
It should not be using system resources (except for a slot in the zombie process table).
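
Since a zombie only disappears when its parent collects its exit status, the useful thing to look up is the parent PID. A minimal sketch, assuming a ps that accepts the pid, ppid, stat, and comm output keywords (as procps on Linux does); list_zombies is a made-up name:

```shell
# Hypothetical helper: read `ps -eo pid,ppid,stat,comm` output on stdin
# and print "PID PPID COMMAND" for every zombie (state starts with Z).
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Usage: ps -eo pid,ppid,stat,comm | list_zombies
```

If many zombies share one parent, that parent (often a CGI wrapper or daemon that never calls wait) is the process to fix or restart.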

chrisb
09-18-2002, 06:23 AM
Thanks, Paul.

UmBillyCord
09-18-2002, 11:45 AM
Bailey, you should have changed the "up 12 days" to "up 120 days" to give your point more validity. :)

chrisb
09-18-2002, 03:08 PM
2:09 PM load average: 6.26, 5.37, 5.23

...hmmm I guess the high load average last nite (early morning hours) wasn't caused by backups after all.

ADEhost
09-18-2002, 03:33 PM
Originally posted by Editor
Yes, we would mostly all have process and load monitors/watchers in place, to detect/kill/alert what's happening where...but in the same way as one domain/script can bring a server to its knees, one of the first considerations must be how many domains/subdomains are going on to a server.

Cheers

Gary

Hello Gary, we have a wide difference of views here.

This is based on Windows, since I don't even have enough Unix accounts to think about hard-core tuning.


I believe that a server should be closed to signups once the average CPU usage hits 40% or spike loads hit 70% (whichever comes first).

We have servers with only 20 domains on them (closed), and now I am happy to say that there is one server with 221 domains (closed).

I really don't understand why everyone asks "how many domains". If you tune the server right (for the given applications), then you can proceed up to a responsible number. No matter what, I would never go to 400 domains, but I might try it as a test with the users' OK, and then move most of the domains to a new server.

I just retuned a ColdFusion server and dropped the CPU usage from 21% to 4%. The clients have sent e-mails noticing the speed increase.

I'm now spending the week studying ASP.NET tuning ( what a new monster ) and I fear the day I have to learn JSP tuning.


Here is a UNIX tuning trick that I used for my e-mail server.

I changed the filesystem block size from 1024 to 4096 bytes (requires a reformat) and got a gain of over 25% in read and write I/O. The cost was that files take up more disk space (about 10%).

That trick does not work in Windows (damn shame, too).

Mike

Webdude
09-18-2002, 03:38 PM
Originally posted by chrisb
2:09 PM load average: 6.26, 5.37, 5.23

...hmmm I guess the high load average last nite (early morning hours) wasn't caused by backups after all.

If there are that many accounts on the server, the backups could very well still be running. cPanel's method of backups isn't too efficient.

By the way, what happens when you try to run top? Permission denied?

baileysemt123
09-18-2002, 03:38 PM
Well now Chris you don't know that. :)

Remember, you're just looking at a number; you have no idea of the cause behind those numbers.

A load of 6 last night could have been caused by a backup plus 30 misc. processes.

A load of 6 today could be caused by 900 misc. processes.

In the end all you see is 6. ;)

Think of it this way... someone gives you two identical sized boxes wrapped in identical gorgeous wrapping paper. One box contains a $1 bill and one box contains a $100 bill. Which is which?

You can't tell. They weigh the same, they look the same.

A 6 is a 6, but just because it was 6 twelve hours ago doesn't mean that it's the same cause 12 hours later. How much free CPU time does the server have right now? That will tell us much more than "6." :)

:cool: Anyways just somethin' to chew on. ;)

:D Bailey
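
One way to see what is behind the number: on Linux the load average counts processes that are runnable (state R) plus those in uninterruptible sleep (state D, usually waiting on disk I/O), which is how a load of 6 can coexist with mostly idle CPUs. A small sketch that tallies both states; load_contributors is a hypothetical name:

```shell
# Hypothetical helper: read `ps -eo stat` output on stdin and count the
# processes that Linux includes in the load average:
# R (running or runnable) and D (uninterruptible sleep, usually disk I/O).
load_contributors() {
    awk '$1 ~ /^R/ { r++ } $1 ~ /^D/ { d++ }
         END { printf "runnable=%d uninterruptible=%d\n", r, d }'
}

# Usage: ps -eo stat | load_contributors
```

The ps command itself shows up as one runnable process, and this is a single snapshot while the load average is smoothed over minutes, so treat the output as a hint rather than a measurement.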

Webdude
09-18-2002, 03:55 PM
Truth be told, one of our servers has 464 accounts on it.
Right now is our prime load time, and the load averages are as follows::

load average: 0.61, 0.64, 0.55
CPU0 states: 17.0% user, 16.0% system, 0.0% nice, 65.1% idle
CPU1 states: 9.0% user, 29.0% system, 0.0% nice, 61.0% idle
Mem: 1027932K av, 1008448K used, 19484K free, 860K shrd, 34488K buff
Swap: 1333352K av, 341972K used, 991380K free 511280K cached

That's while serving out thousands of hits with numerous scripts in operation. There are numerous ways to control resource usage without negatively affecting user accounts or scripts. I don't usually allow this many accounts on a server, but it got away from me. It runs fine as is, but I will be adding another gig of RAM to it. A well-tuned and powerful enough machine could easily handle 500+ accounts actually, but you don't want to push your luck unless maybe you have a quad CPU :D

Anyway, mine is proof that it can work if the SysAdmin knows what he's doing. Signups are off on this server now, but stupid me for not watching that particular thing; I would have preferred it not go past 250.

ADEhost
09-18-2002, 04:15 PM
Originally posted by Webdude

Anyway, mine is proof that it can work if the SysAdmin knows what he's doing.

PREACH on brother Webdude,

CAN I GET A WITNESS

Good system administration will give clients the highest performance and the highest satisfaction.

Webdude
09-18-2002, 04:19 PM
Oh hey...I finally made it into The Brotherhood of Administrators :D

rbuecker
09-18-2002, 06:07 PM
Originally posted by chrisb
Using the "uptime" command, my host shows load averages of 4.65, 4.55, and 4.95.

Is that...

1. excellent
2. above normal
3. normal
4. below normal
5. lousy


If you have a SMP box (dual whatever) running slackware linux, and one of the processors is pegged you will see load times of 1+. If it is generally a very busy box you wouldn't need to worry until it stayed close to 2. With a 4-way SMP, change the numbers to 3 and 4. It all depends on how crazy you want to be with the box, too.

If your host is running Linux, write a CGI/PHP page (or use shell access, if you have it) to `cat /proc/cpuinfo` and you'll see what they're running on, so you can make a better decision (whether it's lousy or not).

If the load is high only for a short period of time, then they could have been doing backups at the time; that is a CPU-intensive activity.
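
If all you want from /proc/cpuinfo is the processor count, counting the "processor" lines is usually enough on Linux. A minimal sketch with a made-up function name, count_cpus:

```shell
# Hypothetical helper: count logical processors in /proc/cpuinfo-style
# text read on stdin (one "processor : N" line per CPU on Linux).
count_cpus() {
    grep -c '^processor'
}

# Usage (Linux): count_cpus < /proc/cpuinfo
```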

chrisb
09-18-2002, 10:49 PM
And 2-3 hours later, from 4-5 PM CST, the load averages were even higher, around 7. With 440 accts, I believe that the server is overloaded.

Aussie Bob
09-18-2002, 10:51 PM
Originally posted by chrisb
And 2-3 hours later, from 4-5 PM CST, the load averages were even higher, around 7. With 440 accts, I believe that the server is overloaded.
....or maybe a handful of those accounts on the server are running badly configured scripts etc.... ??

What does your host say? Are they regulars on this forum?

Webdude
09-18-2002, 11:18 PM
If your host is running linux, write a cgi/php (or if you have shell access) page to `cat /proc/cpuinfo` and you'll see what they're running on to make a better decision (if it's lousy or not).

He was unable to run top, and a well-secured server won't allow `cat /proc/cpuinfo` by anyone other than root either (or cat of anything owned by root, for that matter).

Originally posted by Aussie Bob
....or maybe a handful of those accounts on the server are running badly configured scripts etc.... ??

Actually, I think for it to stay at that level consistently, its reading may either be off, or they could have some software in place constantly monitoring things, which would use some CPU, though I don't know what would keep it that high all the time. But then, I don't know all the available monitoring software out there either.

baileysemt123
09-19-2002, 12:09 AM
I too would like to hear what the host has to say. A number says very little, if we could get some idea what the current idle CPU time is or what's actually running, then it would be possible to make a judgement.

To say "it's 7, it has 440 accounts, therefore it's overloaded" without the other 95% of the server picture, I don't think is being fair. Heck, we don't even have an URL to observe performance. :(

Earlier it was mentioned that scripts seemed to be running more slowly but web pages were serving fine... if a client reported this to me, the first thing I would look at is their scripts. It actually suggests to me that the scripts being run are poorly-written or too resource-intensive for the shared environment -- which I am sure is against the host's TOS (I know it's against mine, anyways). :)

:D But that's just my two specs of copper. *plink plink* We have like zero data here to work with. *sigh* No offense, Chris, I agree 7 is high, but the server you described is really beefy and it might be 420 tiny accounts pushing zero traffic, for all we know. ;)


:D Granny Girl

chrisb
09-19-2002, 12:13 AM
I'd rather keep my host private for the time being, and hope they don't come in here and give hints to reveal themselves. :)

Anyway, I still can't run top, but...

::::::::::::::
/proc/cpuinfo
::::::::::::::
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 11
model name : Intel(R) Pentium(R) III CPU family 1266MHz
stepping : 1
cpu MHz : 1266.098
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 2529.68
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 11
model name : Intel(R) Pentium(R) III CPU family 1266MHz
stepping : 1
cpu MHz : 1266.098
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 2529.68

What does that mean?

Why does Webdude have approximately the same idle CPU % available that my host has, yet webdude's load averages are <1? Is my server overloaded or not?

baileysemt123
09-19-2002, 01:10 AM
The answer is:

You can't tell from what you've provided and what you have access to. You have to ask your host.

The difference between Webdude's server and your server appears to be information that is root-level restricted. So the folks with root would be the ones to ask.

;)

:D Bailey

Techark
09-19-2002, 01:25 AM
Yep you cannot tell from that information. Server load does not always tell the story as others have pointed out.
A perfect example: today, while watching top on a server, I had a runaway process take off and use 98% of the CPU, and I could not kill it either. Yet the whole time, the server load never showed above 1 while it was eating all the CPU.

I had to reboot to kill the process.

Hercules0124
09-19-2002, 01:26 AM
I believe each point stands for the percentage of usage in decimal format. Say 1.00 means the server is using 100% of CPU capacity. If a server has dual processors, it can handle 2.00 fine. But usually when there is a high load average, it means some process is hogging the CPU; a HUP or restart of the process will solve it. I don't think any server should run above .75 load. Those that run 8.00 or something are probably overloading their server with too many customers.


Quoc Le
Ares Network
http://www.aresnetwork.com

chrisb
09-19-2002, 01:34 AM
Interesting discussion, but I'm still confused..

Some say yeah, others say nay...

Which is it? Regardless of how beefy the server is, should the load average be less than 1 or not? I'd like some more opinions.

HRBrendan
09-19-2002, 02:06 AM
If someone tells you that a server with a load higher than 1 is overloaded, they are basically telling you they flat out don't know what they are talking about.

-Brendan

Webdude
09-19-2002, 02:33 AM
HRBrendan is correct, although I wouldn't put it in such wording. As I have said previously, when testing load capacity I have seen that hit 300.00% before completely crashing. 1.0 cannot equal 100%.

This means 7% is not overloaded. Unless you know a machine inside and out, and exactly what it can do, there's no way to know the premium number which that machine should not go over for sustained periods. I would venture to guess that you don't want anything sustained over 45%.

Let's compare it to something easier to understand, bandwidth, as it is similar. CPUs are burstable just like a T-3, or any other line for that matter. If you sustain 7%, you have nothing to worry about yet, but you should be considering future options. In your case, the host should consider not putting any more accounts on that machine. If you as a user start seeing that sustained number increase, then you yourself should consider other options. First question your host, and see what they say. At least make sure you know they know, and that they know you know. If that number does rise, or goes off the charts too often, you should request to be moved to a different machine. If that fails (the host won't move you or does nothing about the rising load) then you should consider other hosts.

Right now, you have nothing to worry about. If it stays at 7% and rises to 9% during peak usage or during backup processes, then you are safe. Put it this way: knowing what I know, if I were a hosting client and saw a sustained load average of 10% or more, I would be looking at my options. There are others here who know just as much as I do, if not more, I'm sure. You could recommend to your host that if they don't know how to control all resources, they could hire someone like myself (of course not me, they wouldn't trust me, being that I am a competitor) to go do it for them, as well as secure the system for them. All they have to do is set up a temporary alternate root login to their system for a trusted contractor.

For now Chris, you are safe, but keep an eye on it. Contact your host and see if they are willing to work with you on it. If you are concerned about a sustained load above 5%, see if you can get moved to a lesser loaded machine. It never hurts to discuss it with them as your first option.

Aussie Bob
09-19-2002, 02:37 AM
Chris, without me reading back through this thread - is your site's performance bad? Does it load very slowly etc? Do you feel a negative effect etc??

chrisb
09-19-2002, 03:01 AM
Originally posted by Aussie Bob
Chris, without me reading back through this thread - is your site's performance bad? Does it load very slowly etc? Do you feel a negative effect etc??

Actually, the pages load fast, and I just checked a cgi page and it loaded fast too.

I'm sure you're wondering why I'm concerned about it, if everything loads fast? The answer is that I thought a cgi page was loading slowly, but I checked it now and it isn't. Also, I have never noticed load averages this high at other hosts, and was wondering why. Of course, I doubt the other hosts had servers nearly as beefy, either.

I'm just glad I'm not using a dv2 host. :) Just read where they were down.

Aussie Bob
09-19-2002, 03:16 AM
Originally posted by chrisb
Also, I have never noticed load averages this high at other hosts, and was wondering why. Of course, I doubt the other hosts had servers nearly as beefy, either.
Plenty of beef there, no worries about that. i forgot, were they SCSI drives with RAID???
I'm just glad I'm not using a dv2 host. :) Just read where they were down.
I think you're talking about them upgrading for more bandwidth providers, other than cogent.
13 240 ms 229 ms 229 ms 64.124.11.181.cogentco.com [64.124.11.181]
14 230 ms 219 ms 219 ms p6-0.core02.sfo01.atlas.cogentco.com [66.28.4.149]
15 280 ms 269 ms 279 ms p14-0.core02.dfw01.atlas.cogentco.com [66.28.4.133]
16 310 ms 289 ms 279 ms p15-0.core01.dfw01.atlas.cogentco.com [66.28.4.25]
17 280 ms 269 ms 279 ms p13-0.core01.iah01.atlas.cogentco.com [66.28.4.98]
18 300 ms 299 ms 309 ms p15-0.core01.tpa01.atlas.cogentco.com [66.28.4.46]
19 310 ms 309 ms 299 ms p5-0.core01.mco01.atlas.cogentco.com [66.28.4.141]
20 300 ms 300 ms 299 ms p14-0.core01.jax01.atlas.cogentco.com [66.28.4.154]
21 310 ms 309 ms 320 ms p5-0.core01.atl01.atlas.cogentco.com [66.28.4.138]
22 319 ms 320 ms 330 ms g49.ba01.b000173-0.atl01.atlas.cogentco.com [66.28.5.242]
23 310 ms 309 ms 310 ms dv2.demarc.cogentco.com [66.28.28.254]
:eek: :blush:

chrisb
09-19-2002, 03:42 AM
Bob, since I answered your question, how 'bout answering mine.

What are your load averages and how many users do you put on a server?
(By "users", say "sue" pays $30/mo for an acct, and say "joe" pays $25/mo for an acct, that's 2 users)

Yes, they are SCSI. Don't know whether they are RAID or not.

Aussie Bob
09-19-2002, 03:55 AM
Originally posted by chrisb
Bob, since I answered your question, how 'bout answering mine.

What are your load averages and how many users do you put on a server?
(By "users", say "sue" pays $30/mo for an acct, and say "joe" pays $25/mo for an acct, that's 2 users)
Not sure if I'm allowed to answer those questions, but I will anyways. Mods feel free to drag me out back and beat me up. :D

Ok. I know I'm opening myself up for heaps of ridicule and attack, but I don't mind. I'll brave it. Will others?? :)

Maximum $1,000/mth revenue per server. A $25/mth is allowed to use 10GB data transfer per month and 1GB disk space. So, that's 40 users only @ the $25/mth account level. Servers have dual 80GB drives etc.

These are our 3 busiest servers -

root@mars [~]# uptime
3:54am up 1 day, 5:13, 1 user, load average: 1.08, 0.75, 0.82

root@neptune [~]# uptime
3:53am up 2 days, 13:15, 1 user, load average: 0.27, 0.25, 0.25

root@pluto [~]# uptime
12:58am up 31 days, 5:39, 1 user, load average: 0.10, 0.06, 0.09

Maintenance work on Mars and Neptune yesterday. Reboots required...

chrisb
09-19-2002, 04:17 AM
Thanks, Bob and that took guts. I appreciate it. Well, you definitely don't overload YOUR servers. Any other hosts here got guts enough to do what Bob did?

Come on, MCHost, Voxtreme, Splashhost, etc, etc., I challenge you to follow suit.

chrisb
09-19-2002, 04:20 AM
BTW, Bob, I like your server names. I'd put them in order from the sun though... Venus, Mars, Earth... :)

Then when you pass 9, you could name them after the stars... just a thought.

Aussie Bob
09-19-2002, 04:24 AM
Originally posted by chrisb
BTW, Bob, I like your server names. I'd put them in order from the sun though... Venus, Mars, Earth... :)
Yeah :blush: I should have done that. Too late now. :bawling:

Haze
09-19-2002, 05:27 AM
Originally posted by Webdude
Believe it or not, I have seen load averages hitting short bursts up to 300% before everything started crashing. I use certain scripts that are meant to crash my servers during testing prior to use. Generally, a server using top of the line hardware and good software can do a sustained 85% for quite a while.


How do you get a processor to use 300%? Is this an overclocked CPU? How exactly do you know it's using 300%? How exactly do you determine the %?

coight
09-19-2002, 05:54 AM
root@safari [~]# uptime
9:59am up 23 days, 21:46, 1 user, load average: 0.01, 0.04, 0.01
root@safari [~]#

We have 12 resellers on our reseller machine, paying a variety of prices. The machine is a Dual P3 1.13 :).

We don't market our reseller hosting much, but I'm guessing customers are happy because we never get tickets from them :)

AussieHosts
09-19-2002, 06:07 AM
5:16am up 76 days, 5:02, 0 users, load average: 0.02, 0.04, 0.3

Much the same. Though we've got 25 clients on there and that was our limit. It can sit there now and prove its worth. :)

Gary

chrisb
09-19-2002, 06:10 AM
I rescind my challenge. Sorry, I wasn't thinking. I realize now that may have opened a can of worms, and that challenge would give hosts free advertising, and cause us all to get backhanded by the moderators.

chrisb
09-19-2002, 06:19 AM
I like those 12 and 25 customers per box limits. :)

Oh, and can someone please answer Haze's question. I'd like to know too, how you can get 300% CPU Usage?

So, I now understand that when you talk about load averages, you're really talking about CPU usage.

AussieHosts
09-19-2002, 06:28 AM
That one of ours is a P4-1.6. As you said though, this could lead to a whole mess. My advice would be to anyone, to ask their potential provider for the uptime and server stats of the very box they would end up on should they order that day. I'd say many of us have different hardware in different datacentres with different historical results. I know one of our boxes proved to be quite troublesome twice, but its all good now. So posting todays uptime and the specs of a server of our choice is probably not a good overall reflection.

Cheers

Gary

AussieHosts
09-19-2002, 06:39 AM
If you edit your post, my replies look out of place... :)

The load averages represent the number of processes waiting for some CPU time or disk access, averaged over the last 1, 5 and 15 minutes. So a single snapshot is not much use.

Cheers

Gary
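
For anyone who would rather watch the three figures from a script than from a one-off uptime call, here is a minimal Python sketch. It assumes a Unix-like box where os.getloadavg() is available; the function name read_load_averages is made up for illustration.

```python
import os


def read_load_averages():
    """Return the 1-, 5- and 15-minute load averages as a dict.

    os.getloadavg() is part of the Python standard library on Unix-like
    systems; it raises OSError if the figures cannot be obtained.
    """
    one, five, fifteen = os.getloadavg()
    return {"1min": one, "5min": five, "15min": fifteen}


if __name__ == "__main__":
    for window, value in read_load_averages().items():
        print(f"{window}: {value:.2f}")
```

Running it every few minutes and keeping the results gives the history that a single snapshot cannot.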

chrisb
09-19-2002, 06:45 AM
I almost always edit my posts. Sorry, Gary. Always wait 5 minutes before answering a post of mine might be a good idea.

BTW, I blame it on the reply box at the bottom. When we didn't have that, I usually previewed them first.

Aussie Bob
09-19-2002, 06:53 AM
Originally posted by chrisb
I like those 12 and 25 customers per box limits. :)
Not really a good indicator of server performance, Chris. One of those 12 accounts could be promised 350GB data transfer/mth and 20GB disk space. That one account could push the server to hell and back. The other accounts on the server will suffer. It's about having good neighbors in your building. :D

Also if those 12 clients were just smallish normal accounts, those 12 clients would want to be paying around $50/mth each to make the server worthwhile. Also depends on the quality of server/datacenter etc. ;)

It all comes down to how many resources each account was promised and how much money they're parting with each month. The more resources they use, the more $$$$$ they part with. At the end of the day, it's about money. :)

Aussie Bob
09-19-2002, 07:00 AM
Originally posted by Editor
That one of ours is a P4-1.6. As you said though, this could lead to a whole mess. My advice would be to anyone, to ask their potential provider for the uptime and server stats of the very box they would end up on should they order that day. I'd say many of us have different hardware in different datacentres with different historical results. I know one of our boxes proved to be quite troublesome twice, but its all good now. So posting todays uptime and the specs of a server of our choice is probably not a good overall reflection.

Cheers

Gary
Yes, dozens of variables. :)

coight
09-19-2002, 08:26 AM
But Bob, we don't promise that much, nor would we even provide it. :) dv2 is a good datacentre; when they have verio & williams they will be on par with some of the better-known ones.

MCHost-Marc
09-19-2002, 09:05 AM
Originally posted by chrisb
Come on, MCHost, Voxtreme, Splashhost, etc, etc., I challenge you to follow suit.

root@cancun [~]# uptime
9:03am up 5 days, 19:48, 1 user, load average: 0.05, 0.30, 0.15
root@cancun [~]#

root@denver [~]# uptime
9:03am up 13 days, 21:03, 1 user, load average: 0.45, 0.56, 0.54
root@denver [~]#

root@geneva [~]# uptime
9:05am up 46 days, 4:58, 1 user, load average: 0.35, 0.78, 0.84
root@geneva [~]#

root@madrid [~]# uptime
9:10am up 11 days, 21:14, 1 user, load average: 0.28, 0.26, 0.10
root@madrid [~]#

root@maui [~]# uptime
9:06am up 90 days, 6:18, 1 user, load average: 0.82, 1.02, 0.61
root@maui [~]#

root@memphis [~]# uptime
9:07am up 30 days, 11:01, 1 user, load average: 0.01, 0.03, 0.00
root@memphis [~]#

root@miami [~]# uptime
9:08am up 42 days, 22:40, 1 user, load average: 0.50, 0.69, 0.40
root@miami [~]#

root@paris [~]# uptime
9:08am up 90 days, 6:18, 1 user, load average: 0.51, 0.89, 0.55
root@paris [~]#

root@rio [~]# uptime
9:09am up 70 days, 13:40, 1 user, load average: 0.73, 0.67, 0.88
root@rio [~]#

root@seattle [~]# uptime
9:11am up 90 days, 6:21, 1 user, load average: 0.55, 0.88, 0.42
root@seattle [~]#

root@sydney [~]# uptime
9:12am up 5 days, 20:06, 1 user, load average: 0.19, 0.46, 0.70
root@sydney [~]#

:)

KDAWebServices
09-19-2002, 09:33 AM
Originally posted by HRBrendan
If someone tells you that a server with a load higher than 1 is overloaded, they are basically telling you they flat out don't know what they are talking about.

-Brendan
I couldn't have put it better myself. People talk about 1.0 being 100% - that's crap. Even if 1.0 were 100% for a single CPU, they are ignoring the fact that the server in question is dual CPU. As stated elsewhere, what matters is the CPU idle time and whether the server is thrashing (using lots of swap space).
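
One common rule of thumb for taking the CPU count into account is to divide the load average by the number of processors. The Python sketch below does only that; the function name load_per_cpu is hypothetical, and the result is a rough indicator under that assumption, not a percentage of CPU use.

```python
def load_per_cpu(load_average, cpu_count):
    """Divide a load average by the number of CPUs.

    This is only a rule of thumb: it says how many runnable or waiting
    jobs there are per processor, not how busy each processor is.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    # Load averages from chrisb's first post in this thread, on a dual-CPU box.
    for load in (4.65, 4.55, 4.95):
        print(f"{load:.2f} over 2 CPUs -> {load_per_cpu(load, 2):.2f} per CPU")
```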

Aussie Bob
09-19-2002, 09:40 AM
Originally posted by Kiwi


root@cancun [~]# uptime
9:03am up 5 days, 19:48, 1 user, load average: 0.05, 0.30, 0.15
root@cancun [~]#

root@denver [~]# uptime
9:03am up 13 days, 21:03, 1 user, load average: 0.45, 0.56, 0.54
root@denver [~]#

root@geneva [~]# uptime
9:05am up 46 days, 4:58, 1 user, load average: 0.35, 0.78, 0.84
root@geneva [~]#

root@madrid [~]# uptime
9:10am up 11 days, 21:14, 1 user, load average: 0.28, 0.26, 0.10
root@madrid [~]#

root@maui [~]# uptime
9:06am up 90 days, 6:18, 1 user, load average: 0.82, 1.02, 0.61
root@maui [~]#

root@memphis [~]# uptime
9:07am up 30 days, 11:01, 1 user, load average: 0.01, 0.03, 0.00
root@memphis [~]#

root@miami [~]# uptime
9:08am up 42 days, 22:40, 1 user, load average: 0.50, 0.69, 0.40
root@miami [~]#

root@paris [~]# uptime
9:08am up 90 days, 6:18, 1 user, load average: 0.51, 0.89, 0.55
root@paris [~]#

root@rio [~]# uptime
9:09am up 70 days, 13:40, 1 user, load average: 0.73, 0.67, 0.88
root@rio [~]#

root@seattle [~]# uptime
9:11am up 90 days, 6:21, 1 user, load average: 0.55, 0.88, 0.42
root@seattle [~]#

root@sydney [~]# uptime
9:12am up 5 days, 20:06, 1 user, load average: 0.19, 0.46, 0.70
root@sydney [~]#

:)
Nearly there Kiwi. ;)
Posted by chrisb
What are your load averages and how many users do you put on a server?
(By "users", say "sue" pays $30/mo for an acct, and say "joe" pays $25/mo for an acct, that's 2 users)
Actually, this time of the night [11:45pm through to 5am] is the quietest time for server/forum/helpdesk activity for us. :)

I like it quiet. :D:agree:

Webdude
09-19-2002, 11:22 AM
Originally posted by Haze


How do you get a processor to use 300%? Is this an overclocked CPU? How exactly do you know it's using 300%? How exactly do you determine the %?

load average: 812.23, 482.54, 203.64

Someone just sent me that. That's higher than even I have seen before. Haze you have to look at it like this. When is a balloon full, how much more air can it actually handle before it explodes? That's about the only way I can explain it..

fcsnc
09-19-2002, 11:52 AM
To reinforce what Gary said (and to correct some of the apparent confusion), a quote from the man pages:

"uptime prints the current time, the length of time the system has been up, the number of users logged on to the system, and the average number of jobs in the run queue over the last 1, 5, and 15 minutes."

Job means, basically, a task. If I click on a hyperlink, that's going to wind up being at least one task on your server. If your load average is over 1.00, then I am going to see no response to my click until at least <your load average here> tasks get to stop waiting and go ahead and do whatever it is they need to do.

As others have pointed out, there is no one factor (CPU, RAM, disk drive configuration, filesystem configuration, switching and routing within the data center, etc.) that you can blame for your high load average. However, I think it's fair to say [again, I am your user/visitor speaking, not your accountant or your upstream ISP or your head tech guru] that you have a problem.

It makes no difference to your client whether you are I/O bound or bandwidth-inhibited or your disk drives are slow or you have a 486 instead of a P4 or you put too many vhosts in your /home directory. Your client will just look for a faster host.

mdrussell
09-19-2002, 12:13 PM
root@manta [~]# uptime
4:17pm up 23 days, 10:23, 1 user, load average: 0.28, 0.47, 0.45

root@barracuda [~]# uptime
4:19pm up 22 days, 12:04, 2 users, load average: 0.66, 0.82, 1.01

root@swordfish [~]# uptime
4:19pm up 10 days, 21:58, 1 user, load average: 0.56, 0.39, 0.45


Just a small sample, 3 of our dual proc servers.

dynamicnet
09-19-2002, 12:40 PM
Greetings:

From our seven year history of working with Sun, Linux, and Windows... anything over 1 is generally high.

But you also have to look at what's running on the box, and the configuration.

Even with that stated, 2 or more would be high in most day-to-day web hosting instances.

Thank you.

Deb
09-19-2002, 02:00 PM
Originally posted by chrisb
Come on, MCHost, Voxtreme, Splashhost, etc, etc., I challenge you to follow suit.

==> TAZ <==
Thu Sep 19 13:57:10 EDT 2002
1:57pm up 27 days, 10:32, 4 users, load average: 0.35, 0.35, 0.40

==> SIX <==
Thu Sep 19 13:57:13 EDT 2002
1:57pm up 27 days, 10:37, 5 users, load average: 0.55, 0.70, 0.69

==> NINE <==
Thu Sep 19 13:57:14 EDT 2002
1:57pm up 27 days, 10:35, 2 users, load average: 0.86, 0.98, 0.83

==> SEVEN <==
Thu Sep 19 13:57:15 EDT 2002
1:57pm up 27 days, 10:37, 3 users, load average: 0.40, 0.54, 0.60

==> ASTRO <==
Thu Sep 19 13:57:16 EDT 2002
1:57pm up 27 days, 11:20, 6 users, load average: 0.38, 0.47, 0.48

==> PHOENIX <==
Thu Sep 19 13:57:17 EDT 2002
1:57pm up 27 days, 11:08, 2 users, load average: 0.48, 0.53, 0.66

==> DEXTER <==
Thu Sep 19 13:57:20 EDT 2002
1:57pm up 27 days, 11:23, 3 users, load average: 0.86, 1.10, 1.18

==> DEEDEE <==
Thu Sep 19 13:57:21 EDT 2002
1:57pm up 27 days, 11:24, 2 users, load average: 0.82, 0.74, 0.70

==> RASMUS <==
Thu Sep 19 13:57:23 EDT 2002
1:57pm up 27 days, 11:19, 3 users, load average: 0.73, 0.59, 0.59

==> QBERT <==
Thu Sep 19 13:57:24 EDT 2002
1:57pm up 12 days, 22:51, 5 users, load average: 0.33, 0.61, 1.00

==> LOLA <==
Thu Sep 19 13:57:25 EDT 2002
1:57pm up 27 days, 11:39, 2 users, load average: 0.43, 0.49, 0.61

==> ZOOMER <==
Thu Sep 19 13:57:26 EDT 2002
1:57pm up 27 days, 12:04, 2 users, load average: 0.85, 0.49, 0.30

==> ESCHER <==
Thu Sep 19 13:57:27 EDT 2002
1:57pm up 27 days, 11:50, 4 users, load average: 0.18, 0.23, 0.26

==> HUGO <==
Thu Sep 19 13:57:28 EDT 2002
1:57pm up 27 days, 10:44, 2 users, load average: 0.40, 0.39, 0.31

==> UNITY <==
Thu Sep 19 13:57:29 EDT 2002
1:57pm up 27 days, 11:27, 2 users, load average: 0.36, 0.34, 0.33

==> HANNA <==
Thu Sep 19 13:57:30 EDT 2002
1:57pm up 27 days, 11:14, 3 users, load average: 0.32, 0.44, 0.42

==> SONIC <==
Thu Sep 19 13:57:31 EDT 2002
1:57pm up 27 days, 10:51, 4 users, load average: 0.72, 0.52, 0.56

==> HC01 <==
Thu Sep 19 13:57:32 EDT 2002
1:57pm up 27 days, 10:27, 1 user, load average: 0.26, 0.32, 0.54

==> MYSQL01 <==
Thu Sep 19 13:58:30 EDT 2002
1:58pm up 165 days, 12:14, 2 users, load average: 0.17, 0.27, 0.26

==> MYSQL02 <==
Thu Sep 19 14:02:25 EDT 2002
1:58pm up 15 days, 5:44, 2 users, load average: 0.11, 0.05, 0.06

==> MYSQL03 <==
Thu Sep 19 13:58:26 EDT 2002
1:58pm up 166 days, 8:28, 1 user, load average: 0.66, 0.69, 0.93

==> MYSQL04 <==
Thu Sep 19 13:59:13 EDT 2002
1:59pm up 10 days, 18:44, 1 user, load average: 0.05, 0.18, 0.19

==> MYSQL05 <==
Thu Sep 19 13:58:36 EDT 2002
1:58pm up 166 days, 7:56, 2 users, load average: 0.85, 0.82, 0.73


For the record: All of the above were taken at prime time, i.e. a weekday between 1:57pm and 2pm EDT. The Community Servers range in equipment from Dual PIII 600s to PIII 1GHz processors with 1 Gigabyte of RAM, and SCSI hard drive arrays.

As far as load averages, there will always be highs and lows. Spikes are common with shared servers. I would be more concerned about a load average that doesn't go down than a high load average that is only temporary due to a spider that decided to visit or script that needed to run. I'm also aware of situations where the load average is artificial. There are many factors to consider overall.
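
One way to tell a passing spike from a sustained load is to compare the 1-minute figure with the 15-minute figure. The Python sketch below is only an illustration: the function name classify_trend and the 10% tolerance are assumptions, not anything uptime itself reports.

```python
def classify_trend(one_min, fifteen_min, tolerance=0.10):
    """Label the load as 'rising', 'falling' or 'steady'.

    The 1-minute average is compared with the 15-minute average; a
    difference within the given tolerance (10% by default, an arbitrary
    choice) counts as steady.
    """
    if one_min > fifteen_min * (1 + tolerance):
        return "rising"
    if one_min < fifteen_min * (1 - tolerance):
        return "falling"
    return "steady"


if __name__ == "__main__":
    # 1- and 15-minute figures taken from outputs posted in this thread.
    print(classify_trend(11.52, 5.33))  # chrisb's first top snapshot
    print(classify_trend(0.33, 1.00))   # QBERT
    print(classify_trend(0.36, 0.33))   # UNITY
```

A server that keeps reporting "rising" or a high "steady" across many samples is the case to worry about; a single "rising" reading may just be a spider or a script run.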

fcsnc
09-19-2002, 03:28 PM
Originally posted by Deb

As far as load averages, there will always be highs and lows. Spikes are common with shared servers. I would be more concerned about a load average that doesn't go down than a high load average that is only temporary due to a spider that decided to visit or script that needed to run. I'm also aware of situations where the load average is artificial. There are many factors to consider overall.

Very well put, Deb. I have taken to sending people to the futurequest site as an example of a host with real value for the price.

I'll bet you don't put 3,000 vhosts on any single one of those servers, either. That approach has got to help with the spikes & troughs.

Deb
09-19-2002, 03:47 PM
Originally posted by fcsnc
I'll bet you don't put 3,000 vhosts on any single one of those servers, either. That approach has got to help with the spikes & troughs.

Not overloading the servers certainly helps to keep the constant load averages down, among other things, however, I doubt it helps to prevent real high spikes since those are most often caused by a single script/activity. I've seen single files on a single site send servers into oblivion. So in reality I could place just one site on each server and still experience high spikes in the load averages depending on what that single site is doing.

Overloading creates headaches for everyone and ends up costing a lot more in the end. The key to speed and profits is actually found by under-loading the servers. Less is a lot more in hosting. You are more than correct in that area.

Thanks for the kudos ;)

Haze
09-19-2002, 06:04 PM
Originally posted by Webdude


load average: 812.23, 482.54, 203.64

Someone just sent me that. That's higher than even I have seen before. Haze you have to look at it like this. When is a balloon full, how much more air can it actually handle before it explodes? That's about the only way I can explain it..
That doesn't explain how you get the %age. If you think a load of 300.00 is %age, you are sadly mistaken my friend. Did you perhaps mean load average and not percentage? Or do you have a way of finding out the %age?

Webdude
09-19-2002, 06:22 PM
I found this in a newsgroup:::

CPU load is the percentage of time that cpu had a process running on it. Load average is the number of processes that could be running at any given time.

Most processes (including server processes) spend the majority of their lives waiting for something (i/o, other processes, whatever), so don't really increase the load on the server. CPU usage charts are probably the best way to measure it when usage is much below 50%, but above that it's important to note how many processes are vying for the CPU(s) when it's in use.

For some systems, 10 is normal. For others, it's a crisis. Like many performance stats, you simply have to monitor the system under "normal" conditions so you know what's wrong in an abnormal state.

Uptime - The System Load average

The uptime command is used to display the system load average. The load average is defined as the average number of processes in the run queue during a particular interval. Generally, a process is in the run queue if:

- It is not waiting for terminal I/O
- It is not in a voluntary wait (it hasn't called 'wait')
- It is not stopped (e.g., waiting to terminate)

The uptime command displays one line of output that contains

- The current time
- The length of time the system has been up
- The average number of jobs in the run queue in the last 1, 5 and 15 min

For example,

$ uptime
10:47am up 55 day(s), 20:42, 14 users, load average: 0.79, 0.62, 0.59

This example was run at 10:47 in the morning. The system has been up for 55 days, 20 hours and 42 minutes, and 14 users are logged in. Over the last minute, the average number of jobs in the run queue was 0.79; over the last 5 minutes, 0.62; over the last 15 minutes, 0.59.

Well, I had always considered "Load Averages" more of a "Stress Level". In a way, I guess it is.
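
The three figures can be pulled out of an uptime line with a short script. The Python sketch below assumes the "load average: a, b, c" layout seen in the outputs posted in this thread; the function name parse_load_averages is hypothetical.

```python
import re

# Matches the tail of an uptime line, e.g. "load average: 0.79, 0.62, 0.59".
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_load_averages(uptime_line):
    """Return the (1-min, 5-min, 15-min) load averages as floats."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(group) for group in match.groups())


if __name__ == "__main__":
    line = "10:47am up 55 day(s), 20:42, 14 users, load average: 0.79, 0.62, 0.59"
    print(parse_load_averages(line))
```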

chrisb
09-19-2002, 06:29 PM
Finally ran "top"

Here's my results. Comments?

4:33pm up 13:28, 2 users, load average: 11.52, 6.53, 5.33
373 processes: 371 sleeping, 1 running, 1 zombie, 0 stopped
CPU0 states: 34.0% user, 19.3% system, 0.0% nice, 46.0% idle
CPU1 states: 26.1% user, 21.2% system, 0.0% nice, 52.0% idle
Mem: 3997096K av, 3616332K used, 380764K free, 0K shrd, 209448K buff
Swap: 1052216K av, 82244K used, 969972K free 2830304K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
21466 seatenth 10 0 5048 5048 1656 D 9.5 0.1 0:00 neomail.pl
20870 nobody 9 0 20992 19M 18236 S 4.7 0.5 0:00 httpd
21177 nobody 9 0 20808 19M 18280 S 4.7 0.5 0:00 httpd
20824 nobody 9 0 21352 20M 18284 S 4.0 0.5 0:00 httpd
20953 bestbs 19 0 1240 1240 828 R 4.0 0.0 0:01 top
21173 nobody 9 0 19700 18M 18232 S 3.4 0.4 0:00 httpd
20822 nobody 11 0 20764 19M 18280 S 2.8 0.5 0:00 httpd
21026 nobody 9 0 20992 19M 18272 S 2.1 0.5 0:00 httpd
20965 nobody 9 0 20700 19M 18256 S 1.9 0.4 0:00 httpd
20994 nobody 9 0 19764 18M 18192 S 1.7 0.4 0:00 httpd
21042 nobody 13 0 21024 19M 18220 S 1.5 0.5 0:00 httpd
21140 nobody 8 0 20752 19M 17380 S 1.5 0.5 0:00 httpd
20876 nobody 9 0 21028 19M 18212 S 1.3 0.5 0:00 httpd
21471 wdisneyw 10 0 2696 2696 1512 D 1.3 0.0 0:00 ads.pl
21176 nobody 9 0 19564 18M 18296 S 0.7 0.4 0:00 httpd
148 root 10 0 0 0 0 DW 0.5 0.0 0:39 kjournald

chrisb
09-19-2002, 07:10 PM
Here's another a few minutes later.

5:14pm up 14:09, 3 users, load average: 13.55, 8.38, 6.30
342 processes: 340 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 13.0% user, 24.0% system, 0.0% nice, 61.1% idle
CPU1 states: 13.0% user, 14.1% system, 0.0% nice, 71.1% idle
Mem: 3997096K av, 3660872K used, 336224K free, 0K shrd, 221420K buff
Swap: 1052216K av, 81444K used, 970772K free 2865884K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
12782 bestws 16 0 1236 1232 844 R 14.8 0.0 0:00 top
12382 nobody 11 0 23696 22M 21260 S 10.7 0.5 0:00 httpd
12477 nobody 9 0 22664 21M 21192 S 4.9 0.5 0:00 httpd
12365 nobody 9 0 23404 22M 21124 S 4.1 0.5 0:00 httpd
12787 mysql 10 0 21264 13M 1988 S 3.3 0.3 0:00 mysqld
12352 nobody 10 0 23056 21M 21248 S 2.4 0.5 0:00 httpd
4624 named 10 0 75892 69M 2352 S 0.8 1.7 25:54 named
5248 mysql 10 0 21260 13M 1988 S 0.8 0.3 0:39 mysqld
12150 nobody 9 0 23744 22M 22300 S 0.8 0.5 0:00 httpd
12161 nobody 9 0 22048 20M 21060 S 0.8 0.5 0:00 httpd
12165 nobody 9 0 23772 22M 21236 S 0.8 0.5 0:00 httpd
1 root 8 0 520 480 452 S 0.0 0.0 0:06 init
2 root 9 0 0 0 0 SW 0.0 0.0 0:00 keventd
3 root 19 19 0 0 0 SWN 0.0 0.0 0:01 ksoftirqd_CPU0
4 root 19 19 0 0 0 SWN 0.0 0.0 0:01 ksoftirqd_CPU1
5 root 9 0 0 0 0 SW 0.0 0.0 0:28 kswapd

This really has me worried now. Should I be worried?

Haze
09-19-2002, 07:19 PM
Originally posted by chrisb
This really has me worried now. Should I be worried?

I really think you should be worried about anything constantly above 4.00 or so. If it were me, I would be looking for a new host.

Paul L.
09-19-2002, 07:47 PM
CPU0 states: 13.0% user, 24.0% system, 0.0% nice, 61.1% idle
CPU1 states: 13.0% user, 14.1% system, 0.0% nice, 71.1% idle


Enough said. The load numbers in top mean nothing; you need to look at the idle CPU.

Chris, if you're not happy, find a new host. We can beat this to death with people saying this and that about top numbers, but the real numbers speak for themselves: that server has a lot of idle CPU and RAM left, and that's what counts.
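
To put a number on the idle CPU, the per-CPU lines can be averaged. The Python sketch below assumes the "CPUn states: ... idle" header layout shown in chrisb's top output (newer versions of top print a different header); the function names idle_by_cpu and mean_idle are made up for illustration.

```python
import re

# Matches lines such as:
# "CPU0 states: 13.0% user, 24.0% system, 0.0% nice, 61.1% idle"
IDLE_RE = re.compile(r"^CPU(\d+) states:.*?([\d.]+)% idle", re.MULTILINE)


def idle_by_cpu(top_header):
    """Map each CPU number to its idle percentage."""
    return {int(cpu): float(idle) for cpu, idle in IDLE_RE.findall(top_header)}


def mean_idle(top_header):
    """Average the idle percentage across all CPUs found in the header."""
    idle = idle_by_cpu(top_header)
    if not idle:
        raise ValueError("no 'CPUn states' lines found")
    return sum(idle.values()) / len(idle)


if __name__ == "__main__":
    header = (
        "CPU0 states: 13.0% user, 24.0% system, 0.0% nice, 61.1% idle\n"
        "CPU1 states: 13.0% user, 14.1% system, 0.0% nice, 71.1% idle\n"
    )
    print(idle_by_cpu(header))
    print(f"mean idle: {mean_idle(header):.1f}%")
```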

Haze
09-19-2002, 07:56 PM
You've got a point, Paul. There are tons of free RAM (though it looks like the server was rebooted not long ago, and the free command is a much better measurement of actual usage) and the CPU seems to be taking it quite well.

Chrisb: do you notice any kind of decrease in performance at all?

tribby
09-19-2002, 08:05 PM
BTW, ls /home | wc -l isn't a good way of seeing how many users are on the system. If it's a CPanel box, try this instead: ls /var/cpanel/users | wc -l
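
The same count can be taken from a script. The Python sketch below mirrors what piping ls into wc -l does, skipping dot-files the way plain ls does; the function name count_entries is hypothetical, and the /var/cpanel/users path is the one named above, which is assumed to exist and be readable only on a CPanel box.

```python
import os


def count_entries(directory):
    """Count the visible (non-dot) entries in a directory."""
    return sum(1 for name in os.listdir(directory) if not name.startswith("."))


if __name__ == "__main__":
    path = "/var/cpanel/users"  # path from the post above; CPanel boxes only
    try:
        print(f"{count_entries(path)} entries in {path}")
    except OSError as error:
        print(f"cannot read {path}: {error}")
```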

Aussie Bob
09-19-2002, 08:22 PM
BTW, this is a really good thread. Good open meaty discussion. :agree:

AceWeb
09-19-2002, 08:43 PM
Originally posted by chrisb
This really has me worried now. Should I be worried?

Yeah, you really should, unless it is a one time deal. Sometimes things like that will happen, and it is ok, they are fixed, but if that happens often, then yeah, look for a new host.

chrisb
09-19-2002, 09:36 PM
Originally posted by Haze
You've got a point, Paul. There are tons of free RAM (though it looks like the server was rebooted not long ago, and the free command is a much better measurement of actual usage) and the CPU seems to be taking it quite well.

Chrisb: do you notice any kind of decrease in performance at all?

Here's what the free command shows at 8:33 pm CST
             total       used       free     shared    buffers     cached
Mem:       3997096    3985104      11992          0     248072    3005676
-/+ buffers/cache:     731356    3265740
Swap:      1052216      80208     972008
sh-2.05$

How does that look?

My pages and cgi's are loading fast. However the load average is down to around 3.5 now at 8:40 pm CST.

Paul L: I'm trying to understand what it all means, and see if I am wrongly concerned. I never said I was unhappy with my host.
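
The "-/+ buffers/cache" row of the free output can be reproduced from the "Mem:" row: buffers and cache are memory the kernel can hand back, so they are subtracted from "used" and added to "free". A minimal Python sketch of that arithmetic follows; the function name memory_without_cache is made up for illustration.

```python
def memory_without_cache(used_kb, free_kb, buffers_kb, cached_kb):
    """Return (used by programs, effectively available) in kilobytes.

    Buffers and page cache are counted as 'used' on the Mem: row, but the
    kernel can reclaim them, so they are moved to the 'free' side here.
    """
    used_by_programs = used_kb - buffers_kb - cached_kb
    effectively_free = free_kb + buffers_kb + cached_kb
    return used_by_programs, effectively_free


if __name__ == "__main__":
    # Figures from the Mem: row of the free output posted above.
    used, available = memory_without_cache(3985104, 11992, 248072, 3005676)
    print(f"used by programs: {used} KB, effectively free: {available} KB")
```

With the figures posted, this gives 731356 KB used and 3265740 KB free, matching the second row of the output, so only about 11992 KB looking "free" on the first row is not a sign of memory pressure.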

Aussie Bob
09-20-2002, 04:34 AM
Originally posted by chrisb
My pages and cgi's are loading fast.
I think that's the bottom line, Chris. Your host is delivering you good performance and that's what really matters. Although the loads are a worry.

KDAWebServices
09-20-2002, 05:26 AM
Originally posted by AceWeb


Yeah, you really should, unless it is a one time deal. Sometimes things like that will happen, and it is ok, they are fixed, but if that happens often, then yeah, look for a new host.
No offence, but I don't think you quite know what you're talking about. The loads are not a reflection on how loaded the server is or how well it is performing, the most important bits of info are:

CPU Idle Time
Free RAM
Swap File in Use

They are all going to give better performance indicators than some number that most people on the planet don't even know the meaning of.

AceWeb
09-20-2002, 08:09 AM
I may not be an expert, but I do know that the load average should not be more than 1.0. As I said, it is OK when it is high sometimes, but the constant load should usually not exceed 1.0 (that is, on a single-processor machine) or 2.0 on a dual-processor machine.
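
For anyone who wants to try that per-processor rule of thumb on their own numbers, here is a minimal Python sketch. The function name `load_per_cpu` is made up for illustration, and the processor counts tried below are assumptions, since the real processor count of chrisb's server is not stated anywhere in this thread. It only does the division; whether the result says anything about performance is exactly what is being argued here.

```python
def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of processors."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


# 4.65 is the 1-minute figure from the first post in this thread.
# The processor counts are assumptions; the real count is not given.
for cpus in (1, 2):
    print(cpus, round(load_per_cpu(4.65, cpus), 2))
```

By this rule a result at or below 1.0 per processor would count as fine, and anything persistently above it as busy.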

HRBrendan
09-20-2002, 10:17 AM
Originally posted by AceWeb
I may not be an expert, but I do know that the load average should not be more than 1.0. As I said, it is OK when it is high sometimes, but the constant load should usually not exceed 1.0 (that is, on a single-processor machine) or 2.0 on a dual-processor machine.

That is something that people assume to be correct because it would be the logical conclusion but it is not necessarily true.

-Brendan

magnafix
09-20-2002, 10:30 AM
Load average means how many processes are waiting for CPU cycles, RAM, or disk I/O. I have seen a load of 15 with command-line access still fine (an NFS server timeout caused lots of web processes to pile up, driving up the load). I have also seen a load average of 3 with choppy command-line access (while compiling a kernel or something).

We use a load-balanced cluster setup so all sites are served by all webservers. This allows us to buy relatively low-end servers (1 GHz, 512 MB RAM, tiny hard drives) and add more as generalized load increases. It also means we never have to 'shuffle' sites from server to server.

Our webservers generally run at a load between 0.5 and 1.0 during the day, and handle as many as 10-15 requests/second each. But the odd customer script can occasionally spike the load to 5+, at which point support gets beeped and we sometimes have to kill things off.

Side note - I sure don't like the idea of any customer being able to 'ls /home' at all. :eek: All our customers are chroot'd inside their homedirs so nobody has any clue about the number or names of other sites.

Aussie Bob
09-20-2002, 10:33 AM
Originally posted by KDAWebServices

No offence, but I don't think you quite know what you're talking about. The load averages are not a reflection of how loaded the server is or how well it is performing. The most important bits of info are:

CPU Idle Time
Free RAM
Swap File in Use

They are all going to be better performance indicators than some number whose meaning most people on the planet don't even know.
So how would you compile that data into a figure to give a true indication of the server's performance?

magnafix
09-20-2002, 10:43 AM
'man vmstat' -- it's a bear to get used to and really interpret properly, but it's probably the best tool available to ascertain and diagnose performance.

Webdude
09-20-2002, 11:14 AM
Originally posted by AceWeb
I may not be an expert, but I do know that the load average should not be more than 1.0. As I said, it is OK when it is high sometimes, but the constant load should usually not exceed 1.0 (that is, on a single-processor machine) or 2.0 on a dual-processor machine.

Do I hear that false buzzer going off? Yep, that's definitely a false buzzer....no mistaking it :D

chrisb
09-22-2002, 11:29 PM
Thanks for all of your input. I have canceled my host, and unfortunately have to look again. I really tried to make it work at this host, but could not. Their support was excellent, and that was never a problem.

However, recently the load averages kept rising, and the free RAM kept shrinking...

8:24pm MT Sunday Sept 22
             total       used       free     shared    buffers     cached
Mem:       3997096    3955872      41224          0     114608    2951268
-/+ buffers/cache:     889996    3107100
Swap:      1052216      85612     966604

I don't quite understand what all of the numbers mean, but what I do know is that 5% of the time, I got an error that the "server is too busy" and could not get to my pages, so I'm guessing that those people that said the server was overloaded were correct.

There were other reasons for cancellation also, such as no choice of cPanel skins, items removed from cPanel and WHM, and server downtime.

As a courtesy to this host, I will not mention their name, and make them look bad, unless they choose to come into this thread and discuss it.

AceWeb
09-23-2002, 12:37 AM
These numbers represent how much memory you have (total), how much is used, and how much is left. Same info for swap.

Let’s look at my top, for example:

Mem: 1021692K av, 795140K used, 226552K free, 0K shrd, 22548K buff
Swap: 1048784K av, 21448K used, 1027336K free

So the total memory on the server is about a gig.
About 795 MB is used and about 226 MB is free.

Then there is swap: if the memory is full, the server will start using swap.
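
As a rough aid for reading `free` output, here is a small Python sketch that parses the old-style layout posted earlier in this thread and works out the "-/+ buffers/cache" figures by hand. The helper names `parse_free` and `app_memory` are made up for illustration, and the parser assumes the six-column `Mem:` row shown in chrisb's post (newer versions of `free` use different columns). The sample figures are the ones chrisb posted at 8:33 pm CST.

```python
def parse_free(text):
    """Parse the 'Mem:' and 'Swap:' rows of old-style `free` output (values in KB)."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "Mem:":
            keys = ("total", "used", "free", "shared", "buffers", "cached")
            rows["mem"] = dict(zip(keys, map(int, parts[1:7])))
        elif parts[0] == "Swap:":
            keys = ("total", "used", "free")
            rows["swap"] = dict(zip(keys, map(int, parts[1:4])))
    return rows


def app_memory(mem):
    """Return (KB used by applications, KB available once buffers/cache are reclaimed)."""
    used_by_apps = mem["used"] - mem["buffers"] - mem["cached"]
    available = mem["free"] + mem["buffers"] + mem["cached"]
    return used_by_apps, available


# Figures copied from the `free` output posted earlier in this thread.
SAMPLE = """\
total used free shared buffers cached
Mem: 3997096 3985104 11992 0 248072 3005676
-/+ buffers/cache: 731356 3265740
Swap: 1052216 80208 972008
"""

stats = parse_free(SAMPLE)
used_by_apps, available = app_memory(stats["mem"])
print(used_by_apps, available)  # 731356 3265740
```

The two printed numbers match the "-/+ buffers/cache" row: most of the "used" memory in that output is buffers and cache, which the kernel can hand back to applications when they need it.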


By the way, what are the stats with your new host? Is it better?

AceWeb
09-23-2002, 12:42 AM
Also, another thing I tell my clients about is a program called phpSysInfo. It shows all of these numbers in a graphical interface, in addition to other server stats. You can get it at http://phpsysinfo.sourceforge.net and it is very easy to get running.

chrisb
09-23-2002, 01:14 AM
Thanks for that explanation. I haven't found a new host yet, but I'm looking for one whose load averages are consistently <1, and a host that doesn't put over 100 people on a server next time. :)

AceWeb
09-23-2002, 01:33 AM
You’re Welcome.
The number of people on the server does not matter much; it is the size of those sites, which in a way would reflect on the server load. I can have 100 sites, but 99 of them could be personal sites and make no difference on the server, while that one site could cause all of the problems. Or the 100 sites could all be parked sites.

Something you may want to try is seeing if a host can give you a trial; that way you can get a feel for it and see how things work for you. At least that is how I get customers who want hosting but are too scared to commit to it.

chrisb
09-23-2002, 01:42 AM
Yeah, I understand that argument that number of customers on a server doesn't matter. It's been argued many times here recently. To me it does matter because the more customers you have on a server, the more it increases the chances of someone screwing up the server... but that's another thread. :)

Haze
09-23-2002, 01:45 AM
If you're going to be that picky, you're better off getting a managed dedicated server, or perhaps a managed virtual dedicated server. Most hosts that I know of normally pack between 250 and 500 accounts per server. Sometimes more, but it's unlikely you'll see fewer.

AceWeb
09-23-2002, 01:47 AM
True, but it depends. Once again, if they are personal sites, do not have SSH access, and get very few hits, it is a waste of server space for me. At any rate, I do see your point, but it would be hard to find such a host (that is honest about it).

chrisb
09-23-2002, 01:56 AM
Yeah, I understand that if they have the hardware, etc., that performance will not suffer even if there are many accounts. I said 100 for ideal, but I would probably settle for 200 other users on the same server. But over 500... no way!

HAZE: I knew someone would come along and tell me that I was too picky, and needed to get my own server. :)

AceWeb
09-23-2002, 02:15 AM
Well good luck in your search.

By the way, so your site is totally down now (I assume)?

Aussie Bob
09-23-2002, 02:28 AM
Originally posted by chrisb
Yeah, I understand that argument that number of customers on a server doesn't matter. It's been argued many times here recently. To me it does matter because the more customers you have on a server, the more it increases the chances of someone screwing up the server... but that's another thread. :)
:D:agree:

Richard Ward
09-23-2002, 09:59 AM
9:44AM up 174 days, 13:32, 23 users, load averages: 7.20, 7.21, 8.17

Each number represents the length of the system run queue averaged over 1 minute, 5 minutes, and 15 minutes, respectively. It's difficult to diagnose problems via these three averages. This is where 'top' comes into play. 'top' displays the top processes on the system and periodically updates this information. This includes memory/swap usage, CPU states, and a whole lot more. The averages displayed in w, uptime, etc. are often misleading to even "professionals" of UNIX environments.
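
To make the three numbers concrete, here is a minimal Python sketch that pulls them out of an `uptime` line like the one above. The function name `parse_load_averages` and the regular expression are my own and only assume the "load average:" / "load averages:" wording that `uptime` prints; other formats may need a different pattern.

```python
import re

# Matches both "load average:" (Linux) and "load averages:" (BSD) wording.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)")


def parse_load_averages(uptime_line):
    """Return the (1-minute, 5-minute, 15-minute) load averages as floats."""
    match = LOAD_RE.search(uptime_line)
    if match is None:
        raise ValueError("no load averages found")
    return tuple(float(value) for value in match.groups())


line = "9:44AM up 174 days, 13:32, 23 users, load averages: 7.20, 7.21, 8.17"
one, five, fifteen = parse_load_averages(line)
print(one, five, fifteen)  # 7.2 7.21 8.17
```

On the machine where the script itself runs, Python's `os.getloadavg()` returns the same three figures without any parsing (Unix only).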

And for the record, the high loads above are because that machine is handling hundreds of IRC connections on a single 500 MHz Intel Celeron under-clocked to 233 MHz w/ 96MB PC100 SDRAM. The loads of that machine have been fluctuating between 6.0 and 8.0 for almost 10 months. No problems so far.