Web Hosting Talk

restarting all services


Lem0nHead
06-23-2004, 01:02 AM
hello

I'm debugging a weird problem on my server.
Does anyone know how I can restart ALL currently running services?

thanks

thaphantom
06-23-2004, 01:05 AM
restart the server?

eddy2099
06-23-2004, 01:06 AM
Which operating system? The simplest method might be to reboot.

Haze
06-23-2004, 01:11 AM
cd /etc/init.d/

In there you will see the init scripts for the various services; no doubt the one you want is in there. Just run "./servicename restart" for each one you want restarted. It shouldn't be too hard to write a bash script that does this too.

thaphantom
06-23-2004, 01:16 AM
cd /etc/init.d/
for i in *; do service "$i" restart; done
That will also do it.

Steven
06-23-2004, 01:42 AM
You've got a problem with that: it will also start some lame-o services that were not running and should not be started.
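
A minimal sketch of a loop that avoids this by restarting only the services that report themselves as running. It assumes each init script accepts "status" and exits 0 only when its service is running (a common convention, but not every script follows it). The function name restart_running, its directory argument, and the skip list of Red Hat helper scripts are illustrative assumptions, not something posted in this thread.

```shell
#!/bin/bash
# restart_running DIR
# Restart only the init scripts in DIR whose "status" call exits 0.
# Assumption: "status" exits 0 only when the service is running.
restart_running() {
    local dir="${1:-/etc/init.d}" script name
    for script in "$dir"/*; do
        name="${script##*/}"
        # Skip helper scripts that are not services (assumed Red Hat layout).
        case "$name" in
            functions|halt|killall|single) continue ;;
        esac
        # Skip directories and anything that is not executable.
        [ -f "$script" ] && [ -x "$script" ] || continue
        if "$script" status >/dev/null 2>&1; then
            echo "restarting $name"
            "$script" restart
        else
            echo "skipping $name (not running)"
        fi
    done
}

# Usage (as root): restart_running /etc/init.d
```

Try it against a copy of the directory or with the restart line commented out first, since some scripts treat unknown arguments in surprising ways.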

Lem0nHead
06-23-2004, 11:02 AM
Originally posted by thaphantom
restart the server?

Originally posted by eddy2099
Which operating system? The simplest method might be to reboot.

Linux (RHEL).

The thing is that restarting the server does resolve my problem (high load).
The load gets as high as 5~6 and, after I reboot, it drops to 0.2.
I already tried restarting httpd and mysqld.

Now I would like to restart all of them so I can find out whether some "system" service is causing the problem.

EMT-Chris
06-23-2004, 11:08 AM
First thing to check with high load: make sure you are not cutting into swap.

chris

2uantuM
06-23-2004, 11:15 AM
I love how people try to solve high load problems :P

EMT-Chris
06-23-2004, 11:38 AM
Originally posted by 2uantuM
I love how people try to solve high load problems :P

Glad you're amused...

eth00
06-23-2004, 12:02 PM
What is taking all of the CPU power? Rather than restarting everything, you should look at what is causing the problem and restart only that. Of course, that is not the real solution; the real solution is to figure out what is causing the load to get so high. It might even be that your server is not big enough.

Lem0nHead
06-23-2004, 03:41 PM
Originally posted by 2uantuM
I love how people try to solve high load problems :P

As I said, I'm trying to DEBUG.
I won't restart services every time I have high load; I just want to know if it's SOME service that's causing it.

I have some experience with Linux and:
1) my server is not swapping
2) no process is shown consuming resources in 'top'
3) it happens (I already opened a thread about that) when I run a script that consumes resources
It's weird: my server load is around 0.2, then I run a backup, for example, and the load goes to 4~5. After the backup ENDS, the load doesn't drop back to 0.2.

Also, it LOOKS LIKE it's not really a high load but a problem in the way Linux is calculating the load. I say that because I ran some benchmarks and the server performs almost the same as when the load is 0.2.

EMT-Chris
06-23-2004, 04:13 PM
Where does it state the load is 5? top displays 3 different load figures, which are averages over the last 1, 5, and 15 minutes. So if you have a load of 0.2, then a load of 6 for 10 minutes, and then you stop loading the server, your load for the last 15 minutes will be around 4 (not exact, just giving an example).
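
The same three figures that top shows can be read from /proc/loadavg on Linux, which makes it easy to log them over time. A small sketch; the function name parse_loadavg is made up for illustration, and the sample line in the usage comment is only an example of the file's format.

```shell
#!/bin/bash
# parse_loadavg LINE
# Print the 1-, 5- and 15-minute load averages from a /proc/loadavg-style
# line such as "6.05 5.58 5.47 1/152 3110".
parse_loadavg() {
    local one five fifteen rest
    read -r one five fifteen rest <<< "$1"
    printf '1min=%s 5min=%s 15min=%s\n' "$one" "$five" "$fifteen"
}

# Usage on a live Linux system:
#   parse_loadavg "$(cat /proc/loadavg)"
```

If all three values are high, the load has been high for the whole 15-minute window, not just during a recent spike.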

Chris

Lem0nHead
06-23-2004, 04:17 PM
17:16:19 up 1 day, 14:33, 1 user, load average: 6.05, 5.58, 5.47
152 processes: 150 sleeping, 1 running, 0 zombie, 1 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait    idle
           total    0.0%    0.0%    0.0%    0.0%     0.0%    0.0%    0.0%
           cpu00    0.0%    0.0%    0.0%    0.0%     0.0%    0.0%    0.0%
           cpu01    0.0%    0.0%    0.0%    0.0%     0.0%    0.0%    0.0%
Mem: 1032556k av, 1017596k used, 14960k free, 0k shrd, 87832k buff
345480k active, 598056k inactive
Swap: 2040212k av, 152552k used, 1887660k free 527232k cached

 3110 root      18   0  1300 1300   868 R     0.7  0.1   0:00   0 top
 6828 nobody    12   0 40288  19M 18456 S     0.5  1.9   0:00   1 httpd
14894 nobody    12   0 40208  19M 18260 S     0.5  1.9   0:00   0 httpd
10245 nobody    11   0 52760  31M 28540 S     0.4  3.1   0:03   1 httpd
 9102 nobody    11   0 48708  27M 26044 S     0.3  2.7   0:01   1 httpd
21116 nobody    10   0 56812  35M 31712 S     0.2  3.5   0:03   1 httpd
18705 nobody    11   0 59520  38M 33904 S     0.2  3.8   0:04   1 httpd
16113 nobody    10   0 38576  18M 17308 S     0.2  1.7   0:00   0 httpd
 7855 nobody    10   0 37052  16M 16736 S     0.2  1.6   0:00   1 httpd
25209 nobody    10   0 37024  16M 16732 S     0.2  1.6   0:00   1 httpd
 6264 nobody    10   0 39724  19M 17988 S     0.2  1.9   0:00   0 httpd

EMT-Chris
06-23-2004, 04:20 PM
You need memory. You're Swapping.

Chris

EMT-Chris
06-23-2004, 04:22 PM
Note: Don't run HTTPD as Nobody.

Chris

Lem0nHead
06-23-2004, 04:23 PM
Originally posted by EMT-Chris
You need memory. You're Swapping.

Chris

That's not the problem.
There are 14 MB free.
Even if I had 500 MB free, there would still be more than 100 MB in swap...

Also, as I said, it only happens when I run a program that consumes resources.

You can see that I/O wait = 0.0% (sometimes it goes to 33%).


Well... back to the point...
Does anyone know how to restart the currently running services without starting the non-running ones?

thanks

Lem0nHead
06-23-2004, 04:25 PM
Originally posted by EMT-Chris
Note: Don't run HTTPD as Nobody.

Chris

??

cPanel runs it as nobody by default.
What do you recommend running it as?

EMT-Chris
06-23-2004, 04:26 PM
You will always have free memory; that's why your system uses swap, to free memory. Your swap usage should be 0 kB.

Chances are that when you do your backup it causes the system to use more memory, which then pushes it into swap. Then your system can't recover from the load.

Chris

EMT-Chris
06-23-2004, 04:28 PM
Originally posted by Lem0nHead
??

cPanel runs it as nobody by default.
What do you recommend running it as?


create a user called http or something.

Chris

Lem0nHead
06-23-2004, 04:29 PM
Originally posted by EMT-Chris
create a user called http or something.

Chris

what's the difference between a user called 'nobody' that is used just to run httpd and a user called 'http' that is used just to run httpd?

EMT-Chris
06-23-2004, 04:41 PM
Sorry, I was sidetracked talking with someone while I was reading your top output. Many people post here running Apache as root and that's what I was thinking of. Nobody is fine, as nobody should have no permissions on your server. Sorry for the confusion.

Chris

Haze
06-23-2004, 06:41 PM
Hrm, ok, first of all EMT-Chris:

Swap usage is not a bad thing at all, especially on Unix-type systems. When memory is not being used and hasn't been freed by the program itself, its contents are taken from the memory chip(s) and stored on the hard drive as swap. If it's needed again, it's placed back in memory if an area is free, or sometimes read directly from the hard drive. Swap will almost always be in use; it's the way it all works.
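
One way to tell swap that is merely occupied from a system that is actively swapping is to watch the si/so (swap-in/swap-out) columns of vmstat: sustained non-zero values mean pages are moving to and from disk right now. A rough sketch that pulls out just those two columns. It finds them by header name because the column layout differs between procps versions; the function name swap_activity is made up for illustration.

```shell
#!/bin/bash
# swap_activity
# Read vmstat output on stdin and print the si and so values of each sample.
swap_activity() {
    awk '
        # Header row: remember which columns are named "si" and "so".
        !si {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows: print the two swap columns (repeated header rows are skipped).
        $si ~ /^[0-9]+$/ { print "si=" $si, "so=" $so }
    '
}

# Usage on a live system (3 samples, 5 seconds apart):
#   vmstat 5 3 | swap_activity
```

The first sample vmstat prints is an average since boot, so only the later samples say anything about what is happening now.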

Lem0nHead:

From the looks of things, I'd think it may be a hardware issue, perhaps a fan not working (maybe the CPU fan?). You may want to get someone to check, just in case.

Other than that, memory usage actually looks OK. It never hurts to have more RAM, but I think you're just fine at the moment.

Are there any strange happenings in the logs? Have you tailed them for a while, just to see if you could pick out anything odd?

Perhaps even try installing a new kernel?

linux-tech
06-23-2004, 06:41 PM
Looks like (in this case) httpd is the problem. Edit the configuration, make sure you don't have stuff in there that you don't need, and make sure you've got just enough to get by (plus a little extra) as far as MaxClients, servers and the like.

Lem0nHead
06-23-2004, 07:03 PM
Haze, wolfstream

It doesn't seem to be a hardware or httpd problem because, as I said, once I reboot the server, the load goes back to normal.

I really think it's a problem in the way Linux is measuring the load.
When I'm doing backups or something that really makes the load go high, even logging in to the shell is slower.
That's not the case now, and scripts are loading in normal time (phpBB in around 0.2 seconds; when the load is really 5~6 it takes more than 1 second).

that said, is there another way to track the load of the system?

thanks

Edit: I checked /var/log/messages and see nothing wrong
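
On Linux the load average counts not only runnable processes (state R) but also processes in uninterruptible sleep (state D, usually waiting on disk or network I/O), so a high load with an idle CPU can come from tasks stuck in D state. A minimal sketch for listing those processes, assuming a procps-style ps; the function name load_contributors is made up for illustration.

```shell
#!/bin/bash
# load_contributors
# Read "STATE PID COMMAND" lines on stdin and print only the processes
# in state R (runnable) or D (uninterruptible sleep).
load_contributors() {
    awk '$1 ~ /^[RD]/ { print }'
}

# Usage on a live system:
#   ps -e -o state= -o pid= -o comm= | load_contributors
```

Running it a few times while the load is stuck high shows whether the same PIDs stay in D state.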

Steven
06-23-2004, 07:08 PM
EMT-Chris,

You need to get your facts right; this is the second time you have said not to run httpd as nobody. The main process of Apache runs as "root" because it has to grab port 80, and the other processes run as nobody, apache, www, etc.

linux-tech
06-23-2004, 08:03 PM
doesn't seem to be a hardware or httpd problem because, as I said, once I reboot the server, the load goes back to normal

Of course it does. All of the connections to httpd have been reset. Most likely it starts up again within 10-20 minutes.
You've got 10 httpd processes using 16-30+ MB of RAM each, and they're at the top of top; there's no question what the offending process is.

EMT-Chris
06-23-2004, 08:29 PM
Originally posted by Haze
Hrm, ok, first of all EMT-Chris:

Swap usage is not a bad thing at all, especially on Unix-type systems. When memory is not being used and hasn't been freed by the program itself, its contents are taken from the memory chip(s) and stored on the hard drive as swap. If it's needed again, it's placed back in memory if an area is free, or sometimes read directly from the hard drive. Swap will almost always be in use; it's the way it all works.

Lem0nHead:

From the looks of things, I'd think it may be a hardware issue, perhaps a fan not working (maybe the CPU fan?). You may want to get someone to check, just in case.

Other than that, memory usage actually looks OK. It never hurts to have more RAM, but I think you're just fine at the moment.

Are there any strange happenings in the logs? Have you tailed them for a while, just to see if you could pick out anything odd?

Perhaps even try installing a new kernel?


He has 15 MB of free memory and 150 MB of swap used. He's out of memory. Hey, I could be wrong, but that's what I see when I see those numbers.

EMT-Chris
06-23-2004, 08:30 PM
Originally posted by thelinuxguy
EMT-Chris,

You need to get your facts right; this is the second time you have said not to run httpd as nobody. The main process of Apache runs as "root" because it has to grab port 80, and the other processes run as nobody, apache, www, etc.

thelinuxguy -

You are correct. And I do apologize, but as you can see, I did correct myself.

Chris

Haze
06-23-2004, 08:50 PM
Originally posted by EMT-Chris
He has 15 MB of free memory and 150 MB of swap used. He's out of memory. Hey, I could be wrong, but that's what I see when I see those numbers.

Have another look:


Mem: 1032556k av, 1017596k used, 14960k free, 0k shrd, 87832k buff
345480k active, 598056k inactive
Swap: 2040212k av, 152552k used, 1887660k free 527232k cached


There is quite a lot of inactive memory, meaning memory that is not currently in use. It could be memory that was swapped.

We could get a better idea if the user pasted the output of free from the command line.

Here is the output from one of my servers, top compared to free:
Top:
Mem: 1550140k av, 1495788k used, 54352k free, 0k shrd, 226708k buff
627296k active, 641348k inactive
Swap: 1024088k av, 90540k used, 933548k free 830248k cached

Free:

             total       used       free     shared    buffers     cached
Mem:       1550140    1496164      53976          0     226708     830312
-/+ buffers/cache:     439144    1110996
Swap:      1024088      90540     933548


On this server, the free column of the "-/+ buffers/cache" line above (1110996k) is essentially the free or usable RAM.
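
The "-/+ buffers/cache" line is plain arithmetic on the "Mem:" line: used minus buffers minus cached, and free plus buffers plus cached. A small sketch that reproduces it; the function name real_mem is hypothetical, and it expects the six numbers of the Mem: row in the column order shown above (newer versions of free use a different layout).

```shell
#!/bin/bash
# real_mem TOTAL USED FREE SHARED BUFFERS CACHED   (all in kB)
# Recompute the "-/+ buffers/cache" row from the "Mem:" row of free.
real_mem() {
    local total=$1 used=$2 free=$3 shared=$4 buffers=$5 cached=$6
    echo "used_by_programs=$(( used - buffers - cached ))"
    echo "available=$(( free + buffers + cached ))"
}

# Numbers from the Mem: line above
real_mem 1550140 1496164 53976 0 226708 830312
# prints:
#   used_by_programs=439144
#   available=1110996
```

Both results match the "-/+ buffers/cache" row that free printed, so the kernel can hand most of the "used" memory back to programs on demand.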

EMT-Chris
06-23-2004, 09:12 PM
I was under the impression that inactive memory is simply memory that has not been written to in a while. True, it is eligible to be reclaimed and used for other purposes while its current contents get placed in swap.

Perhaps I am incorrect, but that's the way I have always understood it.

Chris

Lem0nHead
06-23-2004, 09:12 PM
Originally posted by wolfstream
Of course it does. All of the connections to httpd have been reset. Most likely it starts up again within 10-20 minutes.
You've got 10 httpd processes using 16-30+ MB of RAM each, and they're at the top of top; there's no question what the offending process is.

But then it would go back to 5~6 after a few minutes, and that doesn't happen.
When I reboot, it stays at 0.2~0.4 for a few days.

Lem0nHead
06-23-2004, 09:16 PM
Haze

root@server01 [~]# free
             total       used       free     shared    buffers     cached
Mem:       1032556     942624      89932          0      92756     484452
-/+ buffers/cache:     365416     667140
Swap:      2040212     150124    1890088

Seriously,
I highly doubt that's a memory problem.
hehe...

My main guess here is still, as I said, that Linux is not computing the load correctly.
Can't that happen?

linux-tech
06-23-2004, 09:25 PM
How long is the server up before you have to reboot it?
It's entirely possible this is an Apache/httpd problem, e.g. some user's script not cleaning up memory properly (not calling mysql_free_result after fetching rows with mysql_fetch_array, etc.).
Is your kernel up to date, and what version are you using? It's entirely possible that this is down to an older kernel, as I've seen this there as well.

Haze
06-23-2004, 09:26 PM
Just out of curiosity, can you paste your dmesg ?

Lem0nHead
06-24-2004, 12:19 AM
Originally posted by wolfstream
How long is the server up before you have to reboot it?
It's entirely possible this is an Apache/httpd problem, e.g. some user's script not cleaning up memory properly (not calling mysql_free_result after fetching rows with mysql_fetch_array, etc.).
Is your kernel up to date, and what version are you using? It's entirely possible that this is down to an older kernel, as I've seen this there as well.

1) kernel 2.4.26 with grsec
2) see my next post (in a few minutes ;) )

Lem0nHead
06-24-2004, 12:23 AM
OK,
now it's getting very interesting.

Today I ran the backup again,
and guess what? The load stayed the same (4~6) during the backup and, after the backup, it went down to 0.2~0.4 again!!

01:21:13 up 1 day, 22:38, 1 user, load average: 0.36, 0.62, 1.88
144 processes: 143 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait    idle
           total    0.0%    0.0%    0.0%   33.3%    33.6%   33.0%    0.0%
           cpu00    0.0%    0.0%    0.0%    0.0%     0.0%    0.0%    0.0%
           cpu01    0.0%    0.0%    0.0%   33.3%    33.4%   33.1%    0.0%
Mem: 1032556k av, 633680k used, 398876k free, 0k shrd, 44372k buff
158652k active, 406140k inactive
Swap: 2040212k av, 173076k used, 1867136k free 257704k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
13191 nobody    18   0 52324  20M 17144 S     0.7  2.0   0:00   1 httpd
26633 nobody    19   0 56692  24M 21372 S     0.5  2.4   0:02   0 httpd
13939 nobody    15   0 52192  20M 17960 S     0.5  2.0   0:00   0 httpd
31898 nobody    15   0 50404  18M 16672 S     0.5  1.8   0:00   1 httpd
29590 root      17   0  1288 1288   868 R     0.4  0.1   0:00   1 top
 5723 nobody    16   0 62108  29M 24808 S     0.2  2.9   0:02   1 httpd
12740 nobody    12   0 57524  25M 22768 S     0.2  2.5   0:01   1 httpd
26818 nobody    12   0 46456  14M 13224 S     0.2  1.4   0:00   1 httpd
17676 nobody    11   0 46596  14M 13276 S     0.1  1.4   0:00   1 httpd
    1 root       8   0   548  504   480 S     0.0  0.0   0:07   1 init
    2 root       9   0     0    0     0 SW    0.0  0.0   0:00   1 keventd
    3 root      19  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd_CPU
    4 root      18  19     0    0     0 SWN   0.0  0.0   0:00   1 ksoftirqd_CPU


I also noticed the RAM usage dropped a lot:

root@server01 [~]# free
             total       used       free     shared    buffers     cached
Mem:       1032556     650908     381648          0      45404     268800
-/+ buffers/cache:     336704     695852
Swap:      2040212     173076    1867136

So... what can we conclude from that?
I can't think of anything.

Note: the backup is done with rsync to another drive (which was unmounted after the backup ended yesterday).

Edit: just checked the MRTG graph.
Memory usage dropped right after I ran the backup...
It wasn't gradual.
Very, very weird...

Lem0nHead
06-24-2004, 12:24 AM
Originally posted by Haze
Just out of curiosity, can you paste your dmesg ?

uh
it's too big
lots of ** IN_TCP DROP ** and ** IN_UDP DROP **

anyway, read the post above please ;)
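
When dmesg is flooded with firewall messages like these, filtering them out usually leaves something short enough to read or paste. A minimal sketch using the two message strings quoted above; the function name strip_fw_noise is made up for illustration.

```shell
#!/bin/bash
# strip_fw_noise
# Drop the firewall log lines quoted above from dmesg-style text on stdin.
strip_fw_noise() {
    grep -v -e 'IN_TCP DROP' -e 'IN_UDP DROP'
}

# Usage on a live system:
#   dmesg | strip_fw_noise | tail -n 50
```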

Haze
06-24-2004, 12:29 AM
If you have cpanel try running this in SSH as root:
/scripts/smartcheck

Else, install smart ( if not already ) and check those drives out.

Any errors? Do both drives report [OK] ?

Lem0nHead
06-24-2004, 12:32 AM
Originally posted by Haze
If you have cpanel try running this in SSH as root:
/scripts/smartcheck

Else, install smart ( if not already ) and check those drives out.

Any errors? Do both drives report [OK] ?

root@server01 [~]# /scripts/smartcheck
Checking /dev/hda....OK
Checking /dev/hdb....OK

don't know if i should be happy or sad

Steven
06-24-2004, 12:32 AM
You still might be feeling the effects of the iowait issues on RHEL (I'm guessing it's RHEL).


hdparm -Tt /dev/hda

What does it return?

Lem0nHead
06-24-2004, 12:34 AM
Originally posted by thelinuxguy
You still might be feeling the effects of the iowait issues on RHEL (I'm guessing it's RHEL).


What does it return?

Hmm... the server load is normal right now, so I don't know if you can conclude anything from it:

root@server01 [~]# hdparm -Tt /dev/hda

/dev/hda:
Timing buffer-cache reads: 3392 MB in 2.00 seconds = 1696.00 MB/sec
Timing buffered disk reads: 64 MB in 3.06 seconds = 20.92 MB/sec

root@server01 [~]# hdparm -Tt /dev/hdb

/dev/hdb:
Timing buffer-cache reads: 3136 MB in 2.00 seconds = 1568.00 MB/sec
Timing buffered disk reads: 80 MB in 3.00 seconds = 26.67 MB/sec



Should I run it again when the load is high?

Steven
06-24-2004, 12:44 AM
Lem0nHead, your drive is quite slow.


Timing buffered disk reads: 64 MB in 1.27 seconds = 50.39 MB/sec


That's the IDE drive in my server. HT with a 2.4 kernel creates BAD I/O overhead, slowing the server down.

Lem0nHead
06-24-2004, 12:55 AM
Originally posted by thelinuxguy
Lem0nHead, your drive is quite slow.



That's the IDE drive in my server. HT with a 2.4 kernel creates BAD I/O overhead, slowing the server down.

hmm...
Are there any disadvantages to disabling HT?

Steven
06-24-2004, 01:11 AM
HT is supposed to help, but with 2.4 kernels it will slow your server down because of the way 2.4 handles HT.