Thread: Server health..

  1. #1
    Join Date
    Feb 2002
    Location
    Australia
    Posts
    24,009

    Server health..

    Is there a way to compile data from sources such as load, CPU idle etc, to come up with a figure that could be representative of the server's real health?
    AussieHost.com Aussie Bob, host since 2001
    Host Multiple Domains on Fast Australian Servers!!

  2. #2
    Join Date
    Nov 2001
    Posts
    5,383
    I'm interested too.
    Clustered Hosting With Continuous Data Protection (CDP)
    http://www.solidinternet.com
    8 Years of hosting excellence!

  3. #3
    Join Date
    Aug 2002
    Location
    Australia
    Posts
    47
    Interested x 2
    Vinh,
    ShockHost - host me up scotty
    http://www.ShockHost.com
    (Coming soon to a world near you next month)

  4. #4
    Join Date
    Apr 2001
    Posts
    2,588
    What exactly do you mean by the server's health?

  5. #5
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    9,037
    Maybe use MRTG graphs of the info needed? The MRTG logs could then be used to create some sort of value.
    Matt Wallis
    United Communications Limited
    High Performance Shared & Reseller | Managed VPS Cloud | Managed Dedicated
    UK www.unitedhosting.co.uk | US www.unitedhosting.com | Since 1998.

  6. #6
    Join Date
    May 2001
    Posts
    1,513
    Um, health means condition, i.e. poor, good, excellent.

  7. #7
    Join Date
    Feb 2002
    Location
    Australia
    Posts
    24,009
    My request is kind of a continuation of chrisb's thread about server loads etc. and folks saying that's not a good indication of the server's performance. I was just wondering if there was a formula for gathering different pieces of data into one figure and thus getting a picture of server performance...
    AussieHost.com Aussie Bob, host since 2001
    Host Multiple Domains on Fast Australian Servers!!

  8. #8
    Join Date
    Apr 2001
    Posts
    2,588
    Hmm, I'm not sure if a program can give you an accurate view of your server's health really. It ultimately depends on various factors, and those factors differ at times.

    Take Chris' thread for instance. His server's load was high as heck, yet there was enough free RAM available to keep the server under control and it seemed things were going just fine, even with a server load that high. That's not to say I would be comfortable hosting a site with such a high load, but it's really not a determination of the system's health.

    Then we have anile8's thread here, where it is apparent to me that he is running short on RAM.

    My point is, there are a number of factors that can contribute to a server slowing down or becoming "unhealthy". Now I'm no programmer, but I think such an idea would be close to impossible to put into place because of all the different variables that go into the final value. I think the best method to date is a competent system admin.

  9. #9
    Join Date
    May 2001
    Posts
    1,513
    Just for the record, my server was NOT running fine. I thought I had problems, but later said pages loaded fast and without a problem because I didn't remember for sure, and recent tests ran fine, and I also wanted to give the host the benefit of the doubt.

    Within the last few days, I noticed connection problems that were returning "server busy" errors. Also, that host admitted they were overloaded and had people running scripts that were too intensive and would give them a week to remove them, and they also stated that some of their accounts on that server had grown, so in combination with other things, I canceled my acct there.

    Now, the first indication of that problem for me was the server load averages, and since every host that posted their load avgs was <1, it seems to me that load avgs do shed light. Though they may not tell the entire story, they seem to be a pointer to other problems.

  10. #10
    Join Date
    Apr 2001
    Posts
    2,588
    Originally posted by chrisb
    Now, the first indication of that problem for me was the server load averages, and since every host that posted their load avgs was <1, it seems to me that load avgs do shed light. Though they may not tell the entire story, they seem to be a pointer to other problems.
    I thought that would cause a problem sooner or later.

  11. #11
    Join Date
    Nov 2001
    Posts
    5,383
    Beau, that's why Bob said CPU idle time, free RAM, etc.
    Clustered Hosting With Continuous Data Protection (CDP)
    http://www.solidinternet.com
    8 Years of hosting excellence!

  12. #12
    Join Date
    May 2001
    Posts
    1,513
    If I knew what things to compare, I could write a script. Maybe someone smarter than me like umBillyCord or bitserver will drop in and help us out.

  13. #13

    Re: Server health..

    Originally posted by Aussie Bob
    Is there a way to compile data from sources such as load, CPU idle etc, to come up with a figure that could be representative of the server's real health?
    As a *very* rough benchmark:

    $loadaverage is your 15 minute load average (the third value) as reported by `uptime`.
    $numprocs is the number of processors in your machine.
    $tmem is the total amount of memory in your machine.
    $memu is the "used" + "shrd" memory, as reported by `top` (for Linux), or the "active" memory, as reported by `top` (for *BSD).
    $swpu is the "used" swap space, as reported by `top`.
    $dide is the number of IDE drives in your machine.
    $dscs is the number of SCSI drives in your machine.
    $dtps is the sum of the "tps" values shown by `iostat -d`.
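
As a rough illustration of collecting the first input, here is a minimal Python sketch that pulls the 15 minute load average (the third value) out of a line of `uptime` output. The function name `fifteen_minute_load` and the sample line are made up for illustration, and it assumes the values use a dot as the decimal separator.

```python
import re


def fifteen_minute_load(uptime_output: str) -> float:
    """Return the third load-average value from a line of `uptime` output."""
    # Accept "load average:" (Linux) and "load averages:" (BSD); the three
    # values may be separated by commas, spaces, or both.
    match = re.search(
        r"load averages?:\s*([\d.]+)[,\s]+([\d.]+)[,\s]+([\d.]+)",
        uptime_output,
    )
    if match is None:
        raise ValueError("no load averages found in uptime output")
    return float(match.group(3))


# Made-up sample line, not taken from any real server.
sample = " 14:02:11 up 10 days,  3:04,  2 users,  load average: 0.42, 0.37, 0.31"
numprocs = 2  # hypothetical processor count
load_per_cpu = fifteen_minute_load(sample) / numprocs
```

Matching on the "load average" label rather than counting comma-separated fields keeps the result stable whether or not `uptime` prints a "days" field.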

    1. If your $loadaverage/$numprocs is less than 1.0, score 25 points.
    2. If your $loadaverage/$numprocs is between 1.0 and 2.0, score 20 points.
    3. If your $loadaverage/$numprocs is between 2.0 and 5.0, score 12 points.
    4. If your $loadaverage/$numprocs is above 5.0, score 0 points.
    5. If your $memu/$tmem is below 0.5, score 25 points.
    6. If your $memu/$tmem is between 0.5 and 0.75, score 20 points.
    7. If your $memu/$tmem is between 0.75 and 0.9, score 12 points.
    8. If your $memu/$tmem is above 0.9, score 0 points.
    9. If your $swpu/$tmem is below 0.25, score 25 points.
    10. If your $swpu/$tmem is between 0.25 and 0.5, score 20 points.
    11. If your $swpu/$tmem is between 0.5 and 1.5, score 12 points.
    12. If your $swpu/$tmem is above 1.5, score 0 points.
    13. If your $dtps/(100*$dide+150*$dscs) is below 0.25, score 25 points.
    14. If your $dtps/(100*$dide+150*$dscs) is between 0.25 and 0.5, score 20 points.
    15. If your $dtps/(100*$dide+150*$dscs) is between 0.5 and 1.0, score 12 points.
    16. If your $dtps/(100*$dide+150*$dscs) is above 1.0, score 0 points.
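
The sixteen rules above reduce to four ratios, each scored against three thresholds. A minimal Python sketch follows; the function names `band_score` and `health_score` are hypothetical, and the handling of values that sit exactly on a threshold is an assumption, since the rules do not say which band such a value falls into.

```python
def band_score(ratio: float, t1: float, t2: float, t3: float) -> int:
    """Map a ratio onto the 25/20/12/0 point bands."""
    # Assumption: a value exactly on t1 or t2 drops to the lower-scoring
    # band, and a value exactly on t3 still earns 12 points (it is not
    # "above" t3).
    if ratio < t1:
        return 25
    if ratio < t2:
        return 20
    if ratio <= t3:
        return 12
    return 0


def health_score(loadaverage, numprocs, tmem, memu, swpu, dide, dscs, dtps):
    """Sum the four 25-point components into a total between 0 and 100."""
    # The formula weights each IDE drive at 100 and each SCSI drive at 150.
    # A machine with no drives counted would divide by zero here.
    disk_weight = 100 * dide + 150 * dscs
    return (
        band_score(loadaverage / numprocs, 1.0, 2.0, 5.0)
        + band_score(memu / tmem, 0.5, 0.75, 0.9)
        + band_score(swpu / tmem, 0.25, 0.5, 1.5)
        + band_score(dtps / disk_weight, 0.25, 0.5, 1.0)
    )


# Made-up numbers: one CPU, 512 MB RAM, one IDE drive.
quiet_box = health_score(0.4, 1, 512, 200, 50, 1, 0, 20)
busy_box = health_score(3.0, 1, 512, 480, 300, 1, 0, 60)
```

$tmem, $memu and $swpu only need to share a unit, because they are used solely as ratios.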

    You should now have a total score between 0 and 100:
    If your score is 100: Your server seems to be in excellent condition.
    If your score is 90-99: Your server is doing pretty well.
    If your score is 76-89: Not too bad, but you should probably look at where you lost points and see if you can do anything about that.
    If your score is 50-75: This is getting pretty bad. You should definitely look at upgrading the system or reducing the load on it.
    If your score is 0-49: That server is SICK.
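
For anyone scripting this, the score ranges above map to a simple lookup. This is only a sketch; the function name `verdict` and the shortened wording of each result are placeholders.

```python
def verdict(score: int) -> str:
    """Return a short description for a total score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 100:
        return "excellent condition"
    if score >= 90:
        return "doing pretty well"
    if score >= 76:
        return "not too bad, check where points were lost"
    if score >= 50:
        return "pretty bad, upgrade or reduce load"
    return "SICK"
```

Calling `verdict(87)`, for example, returns the "not too bad" description, since 87 falls in the 76-89 range.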
    Last edited by cperciva; 09-23-2002 at 02:16 PM.
    Dr. Colin Percival, FreeBSD Security Officer
    Online backups for the truly paranoid: http://www.tarsnap.com/

  14. #14
    Join Date
    Aug 2002
    Location
    London, UK
    Posts
    9,037
    Let's make the 4 parts out of 25.. then you get a nice clean score out of 100
    Matt Wallis
    United Communications Limited
    High Performance Shared & Reseller | Managed VPS Cloud | Managed Dedicated
    UK www.unitedhosting.co.uk | US www.unitedhosting.com | Since 1998.

  15. #15
    Originally posted by UH-Matt
    Let's make the 4 parts out of 25.. then you get a nice clean score out of 100
    Happy now?
    Dr. Colin Percival, FreeBSD Security Officer
    Online backups for the truly paranoid: http://www.tarsnap.com/

  16. #16
    Join Date
    May 2001
    Posts
    1,513
    Hey Bob,
    I haven't forgotten. I am writing the script.
    -chris

    <EDIT> Just wanted to let you know that I started writing it, but can't get access to some of the info since I don't have root privileges. I also don't know how valid the results would be either, so I may not continue... especially since I'm not getting paid to write it.
    Last edited by chrisb; 09-26-2002 at 06:12 AM.

  17. #17
    Join Date
    May 2001
    Posts
    1,513
    Hey Bob,
    Like I stated above, I'm not sure how valid the other poster's equation is or how helpful the results will be. I started on it, but it seems to me that writing further would be a waste of time since you can see most of that info at a glance from either ps aux or top. Anyway, I'll share with you what I wrote:

    #!/usr/bin/perl -w
    use strict;
    print "Content-type: text/html\n\n";
    ### written by chrisb

    $|=1; # Flush the buffer

    ### Number of processors on your machine
    my $numprocs = 1;

    ### Load Average in last 15 minutes
    ### (uptime prints it last; counting fields from the front breaks
    ### when the uptime is shown without a "days" part)
    my $loadavg = `uptime 2>/dev/null`;
    my @loadavg = split(/,/, $loadavg);
    $loadavg = $loadavg[-1];
    $loadavg =~ s/^\s+|\s+$//g; # Trim spaces and the trailing newline
    print "$loadavg<BR><BR>";
    my $la = $loadavg/$numprocs;

    ### Score the load per processor using the bands from post #13.
    ### Numeric operators (<, <=) are needed here; le/gt compare as strings.
    if ( $la < 1 )
    {
    $loadavg = 25
    }
    elsif ( $la < 2 )
    {
    $loadavg = 20
    }
    elsif ( $la <= 5 )
    {
    $loadavg = 12
    }
    else
    {
    $loadavg = 0
    };

    print "$loadavg<BR><BR>\n";

    print `top -bn 1`;

  18. #18
    Originally posted by chrisb
    Like I stated above, I'm not sure how valid the other poster's equation is or how helpful the results will be.
    I think I remarked that it was a very rough formula. If you understand what the input variables are, and why they matter, you'll be able to work out for yourself how a server is doing, much better than the formula I gave -- it's only going to be at all useful for the people who come in with questions like "My load average is 123.45, is that bad?"
    Dr. Colin Percival, FreeBSD Security Officer
    Online backups for the truly paranoid: http://www.tarsnap.com/

  19. #19
    Join Date
    May 2001
    Posts
    1,513
    Originally posted by cperciva
    I think I remarked that it was a very rough formula. If you understand what the input variables are, and why they matter, you'll be able to work out for yourself how a server is doing, much better than the formula I gave -- it's only going to be at all useful for the people who come in with questions like "My load average is 123.45, is that bad?"
    I didn't mean it as a criticism. Yes, you stated it was just a rough estimate. Anyhow, I just posted what I did, in case anyone wants to write further on it. BTW, there's no error checking or taint checking because this was written for a root user on a suEXEC-enabled server, though it can also be run as non-root. Darn it, my indentation on those elsif statements didn't come out in the post. Oh well.
