Web Hosting Talk







Runaway HTTPD Processes


driverdave
06-02-2002, 12:06 AM
I am experiencing runaway HTTPD processes that hog my CPU (a Celeron). They don't cripple the machine, but they do peg the CPU at 100%.

They started out of the blue after about 45 days of uptime. Started happening daily for about 2 days, then about every few hours. I am not experiencing any abnormal traffic to the server.

I mostly run PHP (4.1.2 w/ Zend Optimizer) code, with lots of MySQL interaction, on top of a RedHat 7.1 Ensim setup.

The processes are un-killable via kill PID. The only way to get rid of them is to stop/start Apache twice. They linger after I stop httpd the first time; after I start it again and stop it once more, they go away.

I checked out my error log for httpd. I really have no idea what's going on with the log, maybe someone can clue me in. I see those processes that ate up all my CPU (PIDs 524 and 575). That was the only place those PIDs were found in the log.

[Fri May 31 14:43:09 2002] [error] [client 66.196.65.11] File does not exist: /var/www/html/robots.txt
[Fri May 31 16:02:49 2002] [notice] child pid 29759 exit signal Segmentation fault (11)
[Fri May 31 16:15:18 2002] [error] [client 66.196.65.18] File does not exist: /var/www/html/robots.txt
[Fri May 31 16:43:10 2002] [notice] child pid 29279 exit signal Segmentation fault (11)
[Fri May 31 16:50:28 2002] [error] [client 64.8.1.222] File does not exist: /var/www/html/multimedia/videos/
[Fri May 31 16:54:02 2002] [notice] child pid 462 exit signal Segmentation fault (11)
[Fri May 31 17:10:50 2002] [error] [client 216.247.70.237] File does not exist: /var/www/html/request/failed/index_failed.htm
[Fri May 31 17:30:41 2002] [warn] child process 524 still did not exit, sending a SIGTERM
[Fri May 31 17:30:41 2002] [warn] child process 575 still did not exit, sending a SIGTERM
[Fri May 31 17:32:46 2002] [notice] bandwidth monitoring enabled (mapping file: /etc/virtualhosting/mappings/apache.domainmap)

Does anyone know of a way to trace those PIDs back to the request that spawned them? I've tried strace -p PID#, but it just hangs.

I can only think of some script getting stuck in a loop, but that shouldn't peg my CPU at 100% indefinitely; the script should just die. Not that I write lots of infinite loops :)

If anyone has ever seen this before and can give me any advice, it would be greatly appreciated.

I've searched all over Google and found lots of things about different Apache mods causing runaway processes, but nothing that seemed to fit my scenario.

I've attached an MRTG %CPU usage graph in case anyone is interested.

MGCJerry
06-02-2002, 12:19 AM
I don't know much about *nix webservers, but it looks like the processes are "seg faulting", which could be the problem.

Personally, I was never impressed with the Celeron chips. They are good for average users, but I personally would never use one for a webserver or heavy operations. I ran on a Celeron 700, and it went to 100% usage on just opening a new window; it was terrible. When an app on my computer segfaults, it takes my CPU to 100% till I kill it. But I'm also running Windows 2k Pro.

I'm not an expert on servers, so my post will most likely mean absolutely nothing to you. Plus I really can't offer any advice here, I'll let another guru tell you :)

:beer:

wave
06-02-2002, 01:37 PM
Use "top" to find out the PIDs of those hogging processes. Then you can do a ps -ef to get the PPID of those processes. How are you killing them? Did you use the -9 flag? Treating the symptoms won't help... you need a cure. :)
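
A minimal Python sketch of the top / ps step above, assuming a Linux procps `ps` that accepts `-eo pid,ppid,pcpu,comm`. The helper names (`parse_ps`, `cpu_hogs`, `snapshot`) and the 25% threshold are made up for illustration. Keep in mind that procps `ps` reports %CPU averaged over the life of the process, so it can lag behind what top shows.

```python
import subprocess


def parse_ps(text):
    """Parse `ps -eo pid,ppid,pcpu,comm` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        try:
            pid, ppid, pcpu = int(parts[0]), int(parts[1]), float(parts[2])
        except ValueError:
            continue  # header line or anything else that is not a data row
        rows.append({"pid": pid, "ppid": ppid, "pcpu": pcpu, "comm": parts[3]})
    return rows


def cpu_hogs(rows, name="httpd", min_cpu=25.0):
    """Return rows for processes called `name` at or above `min_cpu` percent."""
    return [r for r in rows if r["comm"] == name and r["pcpu"] >= min_cpu]


def snapshot():
    """Run ps once and return its raw text (assumes procps-style options)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Something like `cpu_hogs(parse_ps(snapshot()))` would then list the suspect httpd children along with their parent PID.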

driverdave
06-02-2002, 03:05 PM
Yes, using top, I see 2-4 httpd processes each using their respective share of the CPU. Like, if there are 3 runaways, they'll each be using about 32% of the CPU. Memory usage for the processes is not abnormal, 1% maybe. They just hog the CPU.

I try to kill them by typing 'kill PID#' with no flags set. I rarely have to kill processes, so I'm not too familiar with the flags. But like I said, I can't kill them that way. The only thing that makes them go away is 2 restarts of httpd.

Next time it crops up, I'll try the ps -ef on the PID. I'm assuming that PPID is the parent PID that I have to kill to kill its runaway children?

Fortunately/unfortunately, a system reboot seemed to clear this problem up for now. A rather poor solution :)

What I'd really like to know is a way to get the httpd request that caused the runaway. I'm assuming that is the first step in figuring out the cause of this.

wave
06-02-2002, 03:57 PM
Plain "kill pid" sends a TERM signal, which only works if the process doesn't catch or ignore it. I think that is why you weren't able to kill those hogs. "kill -9 pid" sends KILL, which a process cannot catch, so it will (most likely) kill them.
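
To make the TERM-then-KILL idea concrete, here is a hedged Python sketch. The function name `terminate`, the grace period, and the injectable `send` parameter are all illustrative assumptions; by default `send` is `os.kill`, and signal 0 is used only to check whether the PID still exists.

```python
import os
import signal
import time


def terminate(pid, grace=5.0, poll=0.1, send=os.kill):
    """Send SIGTERM, wait up to `grace` seconds, then fall back to SIGKILL.

    Returns the name of the last signal that was needed.
    """
    send(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        try:
            send(pid, 0)  # signal 0: existence check only, nothing is delivered
        except ProcessLookupError:
            return "SIGTERM"  # the process went away on its own
        time.sleep(poll)
    try:
        send(pid, signal.SIGKILL)  # cannot be caught or ignored
    except ProcessLookupError:
        return "SIGTERM"  # it exited just before the fallback
    return "SIGKILL"
```

Run as root (or as the user that owns the process), `terminate(524)` would be the scripted equivalent of `kill 524` followed, if needed, by `kill -9 524`.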

"ppid" is short for parent pid. Without knowing the problem, I can't say whether killing the parent will help. For instance the parent of an orphan process is already dead.

The first thing I'd do is look for a pattern from raw log entries preceding Fri May 31 16:02:49, 16:43:10, 16:54:02 and 17:30:41. For example, a common request or process that ran before each of those times. Then try to duplicate those calls and check for zombies/orphans, etc. Good luck!
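
One way to sketch that log comparison in Python. This assumes the access log is in Common Log Format (timestamps like `[31/May/2002:16:02:40 -0700]`) and that both logs use the same local time zone; the function names and the 60-second window are hypothetical choices, not anything from the server's setup.

```python
import re
from datetime import datetime, timedelta

# Matches error_log lines such as:
#   [Fri May 31 16:02:49 2002] [notice] child pid 29759 exit signal Segmentation fault (11)
#   [Fri May 31 17:30:41 2002] [warn] child process 524 still did not exit, sending a SIGTERM
EVENT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[\w+\] child (?:pid|process) (?P<pid>\d+) "
    r"(?P<what>exit signal .+|still did not exit.*)$"
)

# Assumed Common Log Format timestamp; the UTC offset is matched but ignored.
ACCESS_TS_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def crash_events(error_lines):
    """Return (timestamp, pid, description) for each child crash or hang line."""
    events = []
    for line in error_lines:
        m = EVENT_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append((ts, int(m.group("pid")), m.group("what")))
    return events


def requests_before(access_lines, when, window_seconds=60):
    """Return access-log lines stamped within `window_seconds` before `when`."""
    start = when - timedelta(seconds=window_seconds)
    hits = []
    for line in access_lines:
        m = ACCESS_TS_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S")
        if start <= ts <= when:
            hits.append(line.rstrip("\n"))
    return hits
```

Feeding each event's timestamp into `requests_before` and looking for a URL that shows up before every crash is the pattern hunt described above.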

ffeingol
06-02-2002, 04:10 PM
driverdave,

Do you have MaxRequestsPerChild set to something other than zero?

I was having the exact same problem on my server. One of the Apache child processes would die with a segmentation fault and then chew up a lot of the CPU. I could not do a graceful restart at that point, only a restart (i.e. stop/start).

The bad news is that I have not found a solution to the problem.

Frank

driverdave
06-02-2002, 06:29 PM
Frank, MaxRequestsPerChild is commented out. I haven't tweaked my setup at all, it's just a stock Ensim build. I'm not even sure what to set this to.
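
For reference, a sketch of what setting the directive might look like in httpd.conf. The value 1000 is an arbitrary illustration, not a recommendation from anyone in this thread. With the line commented out, Apache 1.3 falls back to its compiled-in default of 0, meaning a child never expires. Recycling children like this would not address whatever makes a child segfault or spin; it only limits how long any one child lives.

```apache
# httpd.conf (Apache 1.3)
# Each child process exits after serving this many requests and is replaced.
# 1000 is an illustrative value only; 0 means "never expire".
MaxRequestsPerChild 1000
```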

Wave, I think that may be my only way to track down those requests. Thanks for the tip.

driverdave
06-11-2002, 07:25 PM
Although I don't want to proclaim victory too early, I think I've solved my problem.

For some reason, I couldn't put 2 and 2 together to realize that my problems started shortly after I upgraded PHP with the 4.1.2 RPMs (RedHat 7.1) from http://ensim.gamesquad.net/ .

I compiled PHP 4.2.1 from php.net, and I've been problem free since.

I just wanted to close my own thread in case anyone else runs into this :)

xerocity.com
06-11-2002, 07:39 PM
Just a side note:

While top is running, you can just hit "k" and it will prompt you for the process ID (which can be obtained from the left column). Enter the process ID, press Enter, and it will kill the process.

P.S. This is on Red Hat Linux 7.2; I don't know if it is like this on other *nix OSes.