
Server load at 61 - NO CPU usage
CCF Hosting 02-16-2004, 11:14 PM Hello,
We have a P4 2.8GHz HT server w/ 512MB RAM at SM. They are also checking into this; however, the load was at 61 and there is only about 5% CPU usage. The iowait was VERY high, but what is that?
Here is the TOP info:
22:11:33 up 15 days, 5:54, 1 user, load average: 61.52, 40.58, 27.27
430 processes: 429 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 0.5% 0.0% 1.3% 0.0% 0.1% 97.8% 0.0%
cpu00 0.7% 0.0% 1.3% 0.0% 0.1% 97.6% 0.0%
cpu01 0.3% 0.0% 1.3% 0.0% 0.1% 98.0% 0.0%
Mem: 495776k av, 405756k used, 90020k free, 0k shrd, 5016k buff
264296k actv, 37916k in_d, 6408k in_c
Swap: 1052248k av, 244556k used, 807692k free 99552k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
22682 root 15 0 1792 1756 816 R 0.3 0.3 0:24 1 top
17617 sponder 15 0 1216 960 820 S 0.2 0.1 0:02 0 sshd: sponder@pts/0
29606 mysql 15 0 34016 16M 1028 D 0.1 3.3 0:00 1 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --us
1 root 15 0 492 464 436 S 0.0 0.0 6:58 0 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
7 root 15 0 0 0 0 SW 0.0 0.0 3:15 1 kswapd
8 root 15 0 0 0 0 SW 0.0 0.0 1:30 1 kscand
10 root 15 0 0 0 0 SW 0.0 0.0 0:31 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
18 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 katad-1
20 root 25 0 0 0 0 SW 0.0 0.0 0:00 1 scsi_eh_0
24 root 15 0 0 0 0 DW 0.0 0.0 77:39 0 kjournald
80 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
3559 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
3561 root 15 0 0 0 0 SW 0.0 0.0 0:03 1 loop0
3974 root 15 0 244 212 164 S 0.0 0.0 0:25 0 syslogd -m 0
3978 root 15 0 200 188 144 S 0.0 0.0 0:10 0 klogd -x
3988 root 15 0 248 240 192 S 0.0 0.0 1:08 0 irqbalance
6074 root 15 0 392 180 164 S 0.0 0.0 0:00 0 /usr/sbin/sshd
6087 root 15 0 340 296 224 S 0.0 0.0 0:01 0 xinetd -stayalive -pidfile /var/run/xinetd.pid
6096 root 15 0 380 304 168 S 0.0 0.0 0:56 1 antirelayd
6106 root 15 0 1672 776 204 S 0.0 0.1 0:29 1 chkservd
6128 root 15 0 376 308 168 S 0.0 0.0 0:51 0 antirelayd
6144 root 15 0 204 180 120 S 0.0 0.0 0:05 0 crond
6290 cpanel 25 0 356 88 84 S 0.0 0.0 0:00 1 /usr/bin/stunnel-4.04local /usr/local/cpanel/etc/stunnel/d
6296 root 16 0 1636 340 196 S 0.0 0.0 0:00 0 whostmgrd
6310 root 15 0 140 92 88 S 0.0 0.0 0:00 0 rhnsd --interval 240
6327 root 25 0 1204 16 12 S 0.0 0.0 0:00 0 /usr/bin/perl /usr/local/bin/ipalert_statd
stdunbar 02-16-2004, 11:58 PM Load is how many processes are waiting to run, not how much CPU is being used. On Linux that count also includes processes blocked in uninterruptible I/O wait, which is why load can be high while the CPU sits idle. The three numbers are the load averages over the last 1, 5, and 15 minutes.
Something is really taking a ton of I/O - maybe disk, maybe network. top isn't showing enough. There are a total of 430 processes but you're only seeing a few.
Try a "ps -efm" - this will show all processes and the threads associated with them. As the threading model on Linux is, um, challenged, it's important to understand how many threads are running too.
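To make the point about load concrete: on Linux, the load average counts processes that are runnable (state R) plus those in uninterruptible sleep (state D, usually waiting on disk). Tallying process states therefore hints at where a high load comes from. Below is a minimal Python sketch, assuming the text comes from something like `ps -eo stat,comm`; the function name `count_states` and the sample listing are illustrative and not taken from the original posts.

```python
from collections import Counter

def count_states(ps_text):
    """Tally the first letter of the STAT column (R, S, D, Z, ...)."""
    counts = Counter()
    for line in ps_text.strip().splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "STAT":
            continue
        counts[fields[0][0]] += 1
    return counts

# Made-up sample in the shape of `ps -eo stat,comm` output (an assumption).
sample = """\
STAT COMMAND
S    init
DW   kjournald
D    mysqld
R    top
S    sshd
"""

states = count_states(sample)
# Runnable plus uninterruptible processes are the ones that feed the load average.
print("R:", states["R"], "D:", states["D"],
      "load contributors:", states["R"] + states["D"])
```

A large D count next to a tiny R count would be consistent with the high iowait and idle CPU shown in the top output.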
hiryuu 02-17-2004, 12:15 AM Looking at your cache level and swap in use, I'd guess you're thrashing. Try the command 'vmstat 5 5' to see what level of IO activity is going on.
I've also seen that when a disk died. It's unlikely, but you may want to see if there are any disk-related errors near the end of 'dmesg'.
CCF Hosting 02-17-2004, 12:25 AM Hello,
What does thrashing mean?
Thanks and God Bless!
stdunbar 02-17-2004, 01:01 AM Thrashing is a term for what happens when the system is low on memory and so it has to put processes into swap. The operating system uses swap space to conceptually expand the amount of available memory. Say a process hasn't run in a while. The O/S takes the space that is used in physical memory and writes it to disk in the swap partition. Then the O/S can use that memory for another process.
Thrashing occurs when one or more of the processes that the O/S just wrote to disk now needs to do something (for example, an I/O operation finished). But the O/S is low on memory, so it can only afford to bring that process back into memory for a little bit of time before it has to write it back to disk. Then that process again needs to come back into physical memory, and so on. The system gets I/O bound, and the CPU can't do a whole lot because it is waiting for swapped-out processes to come back into physical memory so that it can continue to run them.
I agree with hiryuu that you may be thrashing, though typically you will see a greater percentage of both physical and swap space in use. You've got about 90MB of free physical memory (out of your 512MB) and 807MB of swap free out of a gig.
One thing that hiryuu mentions that I haven't seen but makes a lot of sense is that you are having a disk error. If a particular sector that, say, your MySQL process is accessing is marginal, then it may take a long time to read that sector, seriously slowing down MySQL.
I'd ask that you attach a 'vmstat 5 5' as hiryuu suggested. I'm also interested in seeing what processes are running including threads.
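The memory figures quoted above can be turned into percentages straight from top's summary lines. This is a small Python sketch that assumes the old procps layout shown in the first post ("NNNk av, NNNk used"); the helper name `used_percent` is made up for illustration.

```python
import re

def used_percent(line):
    """Return percent used from a top 'Mem:' or 'Swap:' summary line."""
    total = int(re.search(r"(\d+)k av", line).group(1))
    used = int(re.search(r"(\d+)k used", line).group(1))
    return 100.0 * used / total

# Summary lines copied from the top output in the first post.
mem = "Mem: 495776k av, 405756k used, 90020k free, 0k shrd, 5016k buff"
swap = "Swap: 1052248k av, 244556k used, 807692k free 99552k cached"

print(f"physical memory used: {used_percent(mem):.1f}%")
print(f"swap used: {used_percent(swap):.1f}%")
```

For these lines that comes to roughly 82% of physical memory and 23% of swap. Keep in mind that top's "used" figure here includes buffers and cache, so it overstates what processes are actually holding.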
CCF Hosting 02-17-2004, 01:11 AM Hello,
I appreciate your help. I will give that info ASAP, as soon as the mail stops sending. The server is only used as a mail server, sending about 14.5 emails per second or more. However, earlier last week mySQL failed, which caused the mysql db to lock up, which made us lose 1 day and 2 hrs of emailing time.
I had cPanel fix the DB, but I am now wondering if that mySQL crash had anything to do with this problem now. Other services such as Webmail and apache etc, failed along with it too.
It seems the I/O only goes up when there are no processes running, but drops down to normal when 101 (figure of speech) exim or other processes start up.
At the moment it does not seem to be doing that, but I will not give up because I cannot afford a server to go down out of the blue, especially when I am heading for Florida tomorrow afternoon!
Here are the results of the command you asked me to run, remember exim mail server was running at the time.
[/var/log]# vmstat 5 5
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
9 0 91996 20072 36432 234868 3 5 7 9 8 6 8 6 16 5
4 0 91996 20608 36432 234936 0 0 2 5109 1003 2548 27 30 2 41
3 5 91996 24292 36432 234972 0 0 7 5242 1095 2871 31 33 2 34
1 0 91996 34192 36432 234684 0 0 0 724 246 408 29 26 43 2
4 1 91996 31232 36432 235060 11 0 29 5257 1065 2639 32 32 3 33
Thanks and God Bless!
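One way to read a vmstat run like the one above is to average the columns that separate swapping (si, so) from ordinary disk traffic (bi, bo) and I/O wait (wa). The first data row of vmstat is an average since boot, so it is dropped here. This is a rough Python sketch; `parse_vmstat` and `average` are illustrative names, and the column layout is assumed to match the 16-column output pasted in the post.

```python
def parse_vmstat(text):
    """Parse vmstat text into a list of dicts keyed by column name."""
    rows, header = [], None
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # This is the row of column names.
            header = fields
        elif header and fields[0].isdigit() and len(fields) == len(header):
            rows.append(dict(zip(header, map(int, fields))))
    return rows

def average(rows, column):
    """Mean of one column across the parsed rows."""
    return sum(r[column] for r in rows) / len(rows)

# Output copied from the post above.
output = """\
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
9 0 91996 20072 36432 234868 3 5 7 9 8 6 8 6 16 5
4 0 91996 20608 36432 234936 0 0 2 5109 1003 2548 27 30 2 41
3 5 91996 24292 36432 234972 0 0 7 5242 1095 2871 31 33 2 34
1 0 91996 34192 36432 234684 0 0 0 724 246 408 29 26 43 2
4 1 91996 31232 36432 235060 11 0 29 5257 1065 2639 32 32 3 33
"""

# Drop the first data row: it reports averages since boot, not the current interval.
samples = parse_vmstat(output)[1:]
for col in ("si", "so", "bo", "wa"):
    print(col, average(samples, col))
```

In this sample the swap columns average close to zero while bo (blocks written out) averages around 4000, which would suggest heavy disk writes rather than swapping at that moment. That is only a reading of these five lines, not a diagnosis.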
CCF Hosting 02-17-2004, 01:20 AM Here is the server running all good and nice:
# vmstat 5 5
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 91968 56036 34444 256520 3 5 7 10 8 6 8 6 16 5
0 0 91968 57716 34464 256508 0 0 6 257 145 146 1 1 96 1
0 0 91968 57856 34464 256516 0 0 1 50 108 46 0 0 100 0
0 0 91968 57992 34464 256504 0 0 0 60 116 44 0 0 100 0
1 1 91968 53992 34532 256704 0 0 50 1131 155 137 19 23 51 6
Thanks and God Bless!
Steven 02-17-2004, 01:40 AM Try upgrading the RAM to a gig; that should help performance.
stdunbar 02-17-2004, 01:50 AM Could you attach a ps -efm?
CCF Hosting 02-17-2004, 09:02 AM Hello,
Steve: The server was working just fine until around 9pm yesterday. Then it just started up doing that. It was sending around 14.5 emails per second at a load under 8.
Here it is when the server is not active: Load at 0.02
# ps -efm
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Feb01 ? 00:07:24 init
root 2 0 0 Feb01 ? 00:00:00 [migration/0]
root 3 0 0 Feb01 ? 00:00:00 [migration/1]
root 4 1 0 Feb01 ? 00:00:00 [keventd]
root 5 1 0 Feb01 ? 00:00:00 [ksoftirqd/0]
root 6 1 0 Feb01 ? 00:00:00 [ksoftirqd/1]
root 9 1 0 Feb01 ? 00:00:00 [bdflush]
root 7 1 0 Feb01 ? 00:03:26 [kswapd]
root 8 1 0 Feb01 ? 00:01:36 [kscand]
root 10 1 0 Feb01 ? 00:00:32 [kupdated]
root 11 1 0 Feb01 ? 00:00:00 [mdrecoveryd]
root 18 1 0 Feb01 ? 00:00:00 [katad-1]
root 20 1 0 Feb01 ? 00:00:00 [scsi_eh_0]
root 21 1 0 Feb01 ? 00:00:00 [scsi_eh_1]
root 24 1 0 Feb01 ? 01:21:24 [kjournald]
root 80 1 0 Feb01 ? 00:00:00 [khubd]
root 3559 1 0 Feb01 ? 00:00:00 [kjournald]
root 3561 1 0 Feb01 ? 00:00:03 [loop0]
root 3974 1 0 Feb01 ? 00:00:26 syslogd -m 0
root 3978 1 0 Feb01 ? 00:00:10 klogd -x
root 3988 1 0 Feb01 ? 00:01:09 irqbalance
root 6074 1 0 Feb01 ? 00:00:00 /usr/sbin/sshd
root 6087 1 0 Feb01 ? 00:00:01 xinetd -stayalive -pidfile /var/
root 6096 1 0 Feb01 ? 00:00:58 antirelayd
root 6106 1 0 Feb01 ? 00:00:30 chkservd
root 6128 1 0 Feb01 ? 00:00:53 antirelayd
root 6144 1 0 Feb01 ? 00:00:05 crond
cpanel 6290 1 0 Feb01 ? 00:00:00 /usr/bin/stunnel-4.04local /usr/
root 6296 1 0 Feb01 ? 00:00:00 whostmgrd
root 6310 1 0 Feb01 ? 00:00:00 rhnsd --interval 240
root 6327 1 0 Feb01 ? 00:00:00 /usr/bin/perl /usr/local/bin/ipa
root 6346 1 0 Feb01 ? 00:00:00 /usr/sbin/portsentry -tcp
root 6436 1 0 Feb01 ? 00:00:24 /usr/local/urchin/bin/urchinwebd
nobody 6437 6436 0 Feb01 ? 00:00:00 /usr/local/urchin/bin/urchinwebd
nobody 6438 6436 0 Feb01 ? 00:00:00 /usr/local/urchin/bin/urchinwebd
nobody 6439 6436 0 Feb01 ? 00:00:00 /usr/local/urchin/bin/urchinwebd
nobody 6440 6436 0 Feb01 ? 00:00:00 /usr/local/urchin/bin/urchinwebd
nobody 6441 6436 0 Feb01 ? 00:00:00 /usr/local/urchin/bin/urchinwebd
root 6458 1 0 Feb01 ? 00:00:03 mdmpd
root 6459 6458 0 Feb01 ? 00:00:00 mdmpd
root 6465 1 0 Feb01 tty1 00:00:00 /sbin/mingetty tty1
root 6466 1 0 Feb01 tty2 00:00:00 /sbin/mingetty tty2
root 6467 1 0 Feb01 tty3 00:00:00 /sbin/mingetty tty3
root 6468 1 0 Feb01 tty4 00:00:00 /sbin/mingetty tty4
root 6469 1 0 Feb01 tty5 00:00:00 /sbin/mingetty tty5
root 6470 1 0 Feb01 tty6 00:00:00 /sbin/mingetty tty6
root 6614 1 0 Feb01 ? 00:00:01 pure-ftpd (SERVER)
root 6621 1 0 Feb01 ? 00:00:00 /usr/sbin/pure-authd -s /var/run
root 6709 1 0 Feb01 ? 00:00:00 /bin/sh /usr/bin/mysqld_safe --d
root 7147 1 0 Feb01 ? 00:24:49 /usr/bin/perl /usr/local/cpanel/
root 2772 1 0 Feb02 ? 00:00:53 antirelayd
root 2935 1 0 Feb02 ? 00:00:49 antirelayd
root 2970 1 0 Feb02 ? 00:01:12 cpanellogd - sleeping for logs
root 2971 1 0 Feb02 ? 00:00:00 cpaneld - listening port 2082
root 2977 1 0 Feb02 ? 00:00:00 cppop - accepting on port 110
mailnull 3098 1 0 Feb02 ? 00:00:17 /usr/sbin/exim -bd -q5m
mailnull 3105 1 0 Feb02 ? 00:00:00 /usr/sbin/exim -tls-on-connect -
root 3109 1 0 Feb02 ? 00:00:51 antirelayd
mysql 12857 6709 0 Feb11 ? 00:00:14 /usr/sbin/mysqld --basedir=/ --d
mysql 12877 12857 0 Feb11 ? 00:00:19 /usr/sbin/mysqld --basedir=/ --d
mysql 12878 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12879 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12880 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12881 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12907 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12908 12877 0 Feb11 ? 00:01:14 /usr/sbin/mysqld --basedir=/ --d
mysql 12909 12877 0 Feb11 ? 00:00:00 /usr/sbin/mysqld --basedir=/ --d
mysql 12910 12877 0 Feb11 ? 00:00:06 /usr/sbin/mysqld --basedir=/ --d
mysql 12931 12877 0 Feb11 ? 00:17:00 /usr/sbin/mysqld --basedir=/ --d
named 12968 1 0 Feb11 ? 00:00:00 /usr/sbin/named -u named
named 12970 12968 0 Feb11 ? 00:09:32 /usr/sbin/named -u named
named 12971 12968 0 Feb11 ? 00:09:29 /usr/sbin/named -u named
named 12972 12968 0 Feb11 ? 00:00:06 /usr/sbin/named -u named
named 12973 12968 0 Feb11 ? 00:05:53 /usr/sbin/named -u named
root 13331 1 0 Feb11 ? 00:00:00 webmaild
root 27871 1 0 Feb15 ? 00:00:04 cupsd
root 17461 6296 0 Feb16 ? 00:00:00 whostmgrd
root 23488 1 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23510 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23511 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23512 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23513 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23514 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 23520 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 25970 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 29078 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 29204 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
nobody 29607 23488 0 Feb16 ? 00:00:00 /usr/local/apache/bin/httpd -DSS
root 18629 6074 0 07:58 ? 00:00:00 sshd: sponder [priv]
sponder 18631 18629 0 07:58 ? 00:00:00 sshd: sponder@pts/0
sponder 18632 18631 0 07:58 pts/0 00:00:00 -bash
root 18660 18632 0 07:58 pts/0 00:00:00 su -
root 18661 18660 0 07:59 pts/0 00:00:00 -bash
root 18719 18661 0 07:59 pts/0 00:00:00 ps -efm
Thanks and God Bless!
CCF Hosting 02-24-2004, 10:40 AM Hello,
SM finally did the HD and RAM check, came back fine.
Wanna see the world's highest load (with almost no CPU usage):
09:35:49 up 7:58, 1 user, load average: 570.44, 329.20, 170.53
3041 processes: 3039 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 0.2% 0.0% 1.4% 0.1% 0.7% 97.4% 0.0%
cpu00 0.0% 0.0% 0.9% 0.1% 0.7% 98.0% 0.0%
cpu01 0.4% 0.0% 1.9% 0.1% 0.6% 96.8% 0.0%
Mem: 495776k av, 480152k used, 15624k free, 0k shrd, 3268k buff
260084k actv, 18872k in_d, 5952k in_c
Swap: 1052248k av, 768848k used, 283400k free 15304k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
7 root 15 0 0 0 0 SW 0.1 0.0 0:17 1 kswapd
15169 root 15 0 3520 3520 820 R 0.1 0.7 0:06 1 top
1 root 15 0 496 468 440 S 0.0 0.0 0:05 0 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
8 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 kscand
10 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
18 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 katad-1
PLEASE HELP!!
Thanks and God Bless!
Steven 02-24-2004, 10:43 AM Well, that's not the highest load I've ever seen. 700+ is. You do have a lot of processes; maybe httpd is getting a lot of requests, more than your server can handle.
CCF Hosting 02-24-2004, 10:51 AM Hello,
It is not an HTTP server. http://www.goldsponder.com, all you see is GoldSponder and that is as far as you get. It is a mail server using Perl and mySQL, and the script was NOT even running.
I am getting desperate. I am about to have SM do an OS reload, but I don't want that.
Thanks and God Bless!
coight 02-24-2004, 12:57 PM Sounds like a poorly written script in a loop!
Nice swap usage ;)
CCF Hosting 02-24-2004, 01:07 PM LOL, we have it set up on a few servers, and it worked on this server just fine; then all of a sudden it decided not to work and raised the I/O load.
I am becoming VERY desperate to get this working. Please let me know what I can do.
Thanks and God Bless!
MattMans 02-24-2004, 01:16 PM What were the 3041 processes when the load was at 570?
CCF Hosting 02-24-2004, 02:37 PM Hello,
I am not sure. Most likely processes that were stuck in I/O and could not run. What I gave you was all I had.
Here is the current output, however only one is running out of 106.
ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1524 468 ? S 01:37 0:38 init
root 2 0.0 0.0 0 0 ? SW 01:37 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SW 01:37 0:00 [migration/1]
root 4 0.0 0.0 0 0 ? SW 01:37 0:00 [keventd]
root 5 0.0 0.0 0 0 ? SWN 01:37 0:00 [ksoftirqd/0]
root 6 0.0 0.0 0 0 ? SWN 01:37 0:00 [ksoftirqd/1]
root 9 0.0 0.0 0 0 ? SW 01:37 0:00 [bdflush]
root 7 0.0 0.0 0 0 ? SW 01:37 0:34 [kswapd]
root 8 0.0 0.0 0 0 ? SW 01:37 0:32 [kscand]
root 10 0.0 0.0 0 0 ? SW 01:37 0:01 [kupdated]
root 11 0.0 0.0 0 0 ? SW 01:37 0:00 [mdrecoveryd]
root 18 0.0 0.0 0 0 ? SW 01:37 0:00 [katad-1]
root 20 0.0 0.0 0 0 ? SW 01:37 0:00 [scsi_eh_0]
root 21 0.0 0.0 0 0 ? SW 01:37 0:00 [scsi_eh_1]
root 24 0.7 0.0 0 0 ? SW 01:37 5:15 [kjournald]
root 80 0.0 0.0 0 0 ? SW 01:37 0:00 [khubd]
root 598 0.0 0.0 0 0 ? SW 01:37 0:00 [kjournald]
root 3961 0.0 0.0 1588 460 ? S 01:38 0:00 syslogd -m 0
root 3965 0.0 0.0 1508 324 ? S 01:38 0:00 klogd -x
root 3975 0.0 0.0 1520 412 ? S 01:38 0:02 irqbalance
root 6111 0.0 0.1 7568 692 ? S 01:38 0:00 cupsd
root 6148 0.0 0.1 3556 584 ? S 01:38 0:00 /usr/sbin/sshd
root 6161 0.0 0.1 2144 624 ? S 01:38 0:00 xinetd -stayalive
root 6170 0.0 0.1 4896 748 ? S 01:38 0:02 antirelayd
root 6180 0.0 0.3 6516 1512 ? S 01:38 0:01 chkservd
mailnull 6193 0.0 0.1 4468 588 ? S 01:38 0:00 /usr/sbin/exim -b
mailnull 6198 0.0 0.0 4468 340 ? S 01:38 0:00 /usr/sbin/exim -t
root 6202 0.0 0.1 2836 744 ? S 01:38 0:02 antirelayd
root 6211 0.0 0.0 1572 436 ? S 01:38 0:00 crond
xfs 6248 0.0 0.0 5092 388 ? S 01:38 0:00 xfs -droppriv -da
root 6405 0.0 0.4 6872 2360 ? SN 01:38 0:00 cpanellogd - slee
root 6407 0.0 0.0 6080 320 ? S 01:38 0:00 cpaneld - listeni
root 6417 0.0 0.1 8980 516 ? S 01:38 0:00 /usr/local/apache
root 6433 0.0 0.1 4712 608 ? S 01:38 0:00 webmaild
nobody 6453 0.0 0.2 8980 1332 ? S 01:38 0:00 /usr/local/apache
nobody 6456 0.0 0.3 8980 1780 ? S 01:38 0:00 /usr/local/apache
nobody 6458 0.0 0.3 8980 1740 ? S 01:38 0:00 /usr/local/apache
nobody 6460 0.0 0.3 8980 1624 ? S 01:38 0:00 /usr/local/apache
nobody 6462 0.0 0.3 8980 1748 ? S 01:38 0:00 /usr/local/apache
named 6536 0.2 2.3 55932 11840 ? S 01:38 2:05 /usr/sbin/named -
cpanel 6549 0.0 0.0 14056 408 ? S 01:38 0:00 /usr/bin/stunnel-
root 6568 0.0 0.0 5040 404 ? S 01:38 0:00 whostmgrd
root 6582 0.0 0.0 3548 320 ? S 01:38 0:00 rhnsd --interval
root 6598 0.0 0.3 7048 1692 ? S 01:38 0:00 /usr/bin/perl /us
root 6622 0.0 0.0 1532 220 ? S 01:38 0:00 /usr/sbin/portsen
root 6653 0.0 0.0 3452 384 ? S 01:38 0:00 /usr/local/urchin
nobody 6654 0.0 0.0 3452 324 ? S 01:38 0:00 /usr/local/urchin
nobody 6655 0.0 0.0 3452 324 ? S 01:38 0:00 /usr/local/urchin
nobody 6656 0.0 0.0 3452 324 ? S 01:38 0:00 /usr/local/urchin
nobody 6657 0.0 0.0 3452 324 ? S 01:38 0:00 /usr/local/urchin
nobody 6658 0.0 0.0 3452 324 ? S 01:38 0:00 /usr/local/urchin
nobody 6660 0.0 0.0 664 124 ? S 01:38 0:02 /usr/local/urchin
root 6675 0.0 2.4 12152 12148 ? SL 01:38 0:00 mdmpd
root 6683 0.0 0.0 1500 200 tty2 S 01:38 0:00 /sbin/mingetty tt
root 6684 0.0 0.0 1504 200 tty3 S 01:38 0:00 /sbin/mingetty tt
root 6685 0.0 0.0 1504 200 tty4 S 01:38 0:00 /sbin/mingetty tt
root 6686 0.0 0.0 1512 200 tty5 S 01:38 0:00 /sbin/mingetty tt
root 6687 0.0 0.0 1500 200 tty6 S 01:38 0:00 /sbin/mingetty tt
root 6895 0.0 0.1 4116 552 ? S 01:38 0:00 pure-ftpd (SERVER
root 6902 0.0 0.0 3744 232 ? S 01:38 0:00 /usr/sbin/pure-au
root 6944 0.0 0.0 2192 316 ? S 01:39 0:00 /bin/sh /usr/bin/
mysql 6963 0.0 2.6 46748 13000 ? S 01:39 0:01 /usr/sbin/mysqld
mysql 6964 0.0 2.6 46748 13000 ? S 01:39 0:01 /usr/sbin/mysqld
mysql 6965 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6966 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6967 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6968 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6969 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6970 0.0 2.6 46748 13000 ? S 01:39 0:05 /usr/sbin/mysqld
mysql 6971 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
mysql 6972 0.0 2.6 46748 13000 ? S 01:39 0:00 /usr/sbin/mysqld
nobody 7146 0.0 0.3 8980 1624 ? S 01:43 0:00 /usr/local/apache
root 7272 0.1 0.2 7644 1244 ? S 01:47 0:54 /usr/bin/perl /us
mysql 7273 0.1 2.6 46748 13000 ? S 01:47 1:19 /usr/sbin/mysqld
root 7498 0.0 0.0 1508 200 tty1 S 01:49 0:00 /sbin/mingetty tt
nobody 23748 0.0 0.3 8980 1808 ? S 08:30 0:00 /usr/local/apache
nobody 878 0.0 0.2 8980 1252 ? S 08:46 0:00 /usr/local/apache
nobody 14418 0.0 0.3 8980 1740 ? S 09:24 0:00 /usr/local/apache
nobody 14419 0.0 0.3 8980 1748 ? S 09:24 0:00 /usr/local/apache
root 3834 0.1 0.1 4872 824 ? S 12:28 0:05 /usr/sbin/exim -q
root 6866 0.0 0.2 5004 1472 ? S 12:48 0:01 /usr/sbin/exim -q
root 5133 0.0 0.3 5132 1644 ? S 13:13 0:00 /usr/sbin/exim -q
root 17577 0.0 0.3 5132 1732 ? S 13:18 0:00 /usr/sbin/exim -q
root 24642 0.0 0.4 5132 2080 ? S 13:23 0:00 /usr/sbin/exim -q
root 26125 0.0 0.5 5288 2668 ? S 13:24 0:00 /usr/sbin/exim -q
mailnull 26132 0.0 0.5 5292 2760 ? S 13:24 0:00 /usr/sbin/exim -q
root 30315 0.0 0.4 5140 2072 ? S 13:28 0:00 /usr/sbin/exim -q
root 304 0.0 0.4 5132 2072 ? S 13:33 0:00 /usr/sbin/exim -q
root 1112 0.0 0.6 5412 3052 ? S 13:34 0:00 /usr/sbin/exim -q
mailnull 1114 0.0 0.6 5416 3084 ? S 13:34 0:00 /usr/sbin/exim -q
root 1279 0.0 0.5 5412 2788 ? S 13:34 0:00 /usr/sbin/exim -q
mailnull 1281 0.0 0.5 5416 2864 ? S 13:34 0:00 /usr/sbin/exim -q
root 1943 0.0 0.4 5160 2128 ? S 13:34 0:00 /usr/sbin/exim -q
mailnull 1944 0.0 0.4 5164 2212 ? S 13:34 0:00 /usr/sbin/exim -q
root 1964 0.0 0.3 6760 1708 ? S 13:35 0:00 sshd: sponder [pr
sponder 1967 0.0 0.3 6772 1964 ? S 13:35 0:00 sshd: sponder@pts
sponder 1978 0.0 0.2 4272 1340 pts/0 S 13:35 0:00 -bash
root 2012 0.0 0.1 4212 968 pts/0 S 13:35 0:00 su -
root 2013 0.0 0.2 4272 1356 pts/0 S 13:35 0:00 -bash
root 2157 0.0 0.5 5416 2848 ? S 13:35 0:00 /usr/sbin/exim -q
mailnull 2158 0.0 0.5 5420 2892 ? S 13:35 0:00 /usr/sbin/exim -q
root 2189 0.0 0.6 5416 3060 ? S 13:35 0:00 /usr/sbin/exim -q
mailnull 2190 0.0 0.6 5420 3092 ? S 13:35 0:00 /usr/sbin/exim -q
root 2193 0.0 0.6 5420 3052 ? S 13:35 0:00 /usr/sbin/exim -q
mailnull 2194 0.0 0.6 5424 3084 ? S 13:35 0:00 /usr/sbin/exim -q
root 2201 0.0 0.1 2732 788 pts/0 R 13:35 0:00 ps -aux
thedavid 02-24-2004, 03:23 PM Whatever is happening, you're swapping like crazy due to low memory conditions. Either:
1) Someone's hitting your box very hard all at once (lots of exim due to a mail flood, synfloods causing too many apache children to be created, etc)
2) You have an application that has a memory leak, and it just grows and grows in size till it's too big to handle.
Next time you see tons of processes, do a 'ps -aux > /root/ps.txt' before you reboot to fix it (I assume that's what you're doing). You should (likely) see a few processes repeated a ton of times in the ps.txt file that creates - go from there.
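Spotting the "few processes repeated a ton of times" in a saved ps listing can be done by counting commands. This is a minimal Python sketch, assuming the file has the 11-column `ps aux` layout shown earlier in the thread; `top_commands` is a hypothetical helper name, and the sample rows are a handful copied from that listing.

```python
from collections import Counter

def top_commands(ps_text, n=5):
    """Count processes per command in ps aux-style text, most frequent first."""
    counts = Counter()
    lines = ps_text.strip().splitlines()
    for line in lines[1:]:                 # skip the header row
        fields = line.split(None, 10)      # COMMAND is the 11th column
        if len(fields) < 11:
            continue
        command = fields[10].split()[0]    # program name without its arguments
        counts[command] += 1
    return counts.most_common(n)

# A few rows copied from the ps -aux output earlier in the thread.
sample = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1524 468 ? S 01:37 0:38 init
mysql 6963 0.0 2.6 46748 13000 ? S 01:39 0:01 /usr/sbin/mysqld
root 3834 0.1 0.1 4872 824 ? S 12:28 0:05 /usr/sbin/exim -q
root 6866 0.0 0.2 5004 1472 ? S 12:48 0:01 /usr/sbin/exim -q
mailnull 26132 0.0 0.5 5292 2760 ? S 13:24 0:00 /usr/sbin/exim -q
"""

for command, count in top_commands(sample, 3):
    print(count, command)
```

On a real capture, the text would be read from the saved file (for example /root/ps.txt) instead of the inline sample.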
VapoRub 02-24-2004, 06:38 PM You might want to hire a freelance linux administrator to take a thorough look at it.
CCF Hosting 02-24-2004, 06:51 PM Hello,
I have contacted David Baker. He will be looking through it. It is a Mail Server using Exim, sends many thousands of emails per day.
Thanks and God Bless!
xp101 02-28-2004, 03:02 AM Hi CCF Hosting,
Did you ever solve this problem? I have the exact same problem under heavy traffic; under low traffic there seems to be no problem. Please let me know what was wrong.
Thank you!
CCF Hosting 02-28-2004, 03:30 AM Hello,
I am still looking into this; however, it seems to be one of two things.
1) Exim - Emailing going to Yahoo is being bounced because of too many relays to bad addresses in a period of time.
2) Kernel - In Kernel version x.2.1, with IOWAIT, other people have had problems with this type of thing.
However, I know I will find an answer. I have Level 2 cPanel techs working on this, Server Matrix working on this and David Baker working on this. If all three of these geniuses can't figure it out, I will shut down my hosting business and sell soap models of famous celebrities.
** Techs - Is meant as a compliment, not a put down. **
Thanks and God Bless!