
05-30-2005, 07:20 PM
Broken once again
Join Date: Sep 2002
Location: Cornfield, Iowa
Posts: 6,396
Howto: Automatically verify crond is running and restart if not
This one's pretty simple, but since crond keeps stopping on my cPanel servers (no clue why or where, but it's happening), I'll post it here. If it helps someone, great; if not, hey, no worries.
First, log in to your server as root over ssh, however you usually do.
Next, the script (we'll call it /bin/croncheck.sh). Use whatever editor you like to create the file.
Code:
#!/bin/bash
# Checks every 30 seconds that crond is alive; restarts it and mails the admin if not.
LOGDATE=`date "+%m-%d-%y [%k:%M:%S]"`
mail=/bin/mail
sysadmin=YOUREMAILHERE
# Scratch file for the alert mail body. It needs its own name: checkpid uses
# $cronfile for the pid file, and sharing one variable made startcron delete
# crond's pid file after every restart.
mailfile=/tmp/cron.txt
logfile=/var/log/croncheck.log
echo "$LOGDATE - Cron Check Service starting up " >> $logfile

function stopcron
{
    /sbin/service crond stop
}

function startcron
{
    DATE=`date "+%m%d%y [%k:%M]"`
    # Stop first so a half-dead crond does not get in the way of the start
    stopcron
    echo "Cron Service Down, attempting to restart on $DATE" >> $mailfile
    cat $mailfile | $mail -s 'Cron Restart' $sysadmin
    /sbin/service crond start
    rm -f $mailfile
}

function checkpid
{
    LOGDATE=`date "+%m-%d-%y [%k:%M:%S]"`
    cronfile=/var/run/crond.pid
    # No pid file at all: restart, and skip the /proc check below
    if [ ! -f $cronfile ];then
        echo "$LOGDATE - Dead PID" >> $logfile
        startcron
        return
    fi
    # Pid file exists: check that a process with that pid is still alive
    thispid=`cat $cronfile`
    if [ ! -d /proc/$thispid ];then
        echo "$LOGDATE - Cron Stopped. Restarting" >> $logfile
        startcron
    else
        echo "$LOGDATE - Normal Cron Running" >> $logfile
    fi
}

# Unused helpers
function quit {
    exit
}

function hello {
    echo $1
}

# COUNTER is never incremented, so this loop runs until the script is killed
COUNTER=0
while [ $COUNTER -lt 10 ]; do
    checkpid
    sleep 30s
done
Simple, right? It actually is. Make sure the script is executable:
Code:
chmod u+rxw /bin/croncheck.sh
Then add this to /etc/rc.local:
Code:
/usr/bin/nohup /bin/croncheck.sh &
To start it without having to restart the server, simply type
Code:
/usr/bin/nohup /bin/croncheck.sh > /dev/null 2>&1 &
and you're set.
This script will email you when it finds crond down, and attempt to start the crond service. Occasionally the restart will fail, but at least you'll know about it and can restart crond yourself.
Enjoy
Last edited by linux-tech; 05-30-2005 at 07:24 PM.

05-30-2005, 07:49 PM
What about this scenario:
Between two runs of checkpid, cron gets stopped and some other process gets started. Due to some randomization feature (random pids), that new process gets the old pid of our cron process, the one in /var/run/crond.pid. I guess the script will fail then, right?
Maybe check whether /proc/pid/exe points to /usr/sbin/cron, or whether cmdline matches "/usr/sbin/cron"?
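For anyone who wants to try the /proc/pid/exe idea, here is a minimal sketch. The function name pid_runs_binary is made up for illustration, and the /usr/sbin/crond path in the usage comment is an assumption (check where the binary lives on your box). Reading the exe link of another user's process needs root, which the watchdog has anyway.

```shell
#!/bin/bash
# Sketch: accept a pid only if /proc/<pid>/exe resolves to the expected binary.
# pid_runs_binary is a hypothetical helper, not part of the original script.
pid_runs_binary() {
    local pid=$1 expected=$2 exe
    # An empty pid can never match
    [ -n "$pid" ] || return 1
    # readlink fails if the process is gone or the link is unreadable
    exe=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 1
    # A binary replaced on disk shows up as "/path (deleted)"; strip that suffix
    exe=${exe% (deleted)}
    [ "$exe" = "$expected" ]
}

# Possible use inside checkpid (path is an assumption):
#   if ! pid_runs_binary "$thispid" /usr/sbin/crond; then startcron; fi
```

With this in place, a recycled pid only passes if the process holding it really is running the cron binary.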

05-30-2005, 08:04 PM
The likelihood of a pid being reused in the (literal) milliseconds between reading the pid and checking its /proc entry is about 0.000001%.

05-31-2005, 02:45 PM
I am talking about this part:
Code:
while [ $COUNTER -lt 10 ]; do
    checkpid
    sleep 30s
done
and not this one:
Code:
thispid=`cat $cronfile`
if [ ! -d /proc/$thispid ];then
During that 30s sleep, crond might get killed. The pid file didn't get wiped, so /var/run/crond.pid still exists and still holds the old pid. 30s is plenty of time for a few new processes to start, and one of those might end up running with the old crond pid.

05-31-2005, 02:50 PM
Again, this isn't an issue.
checkpid does everything in and of itself. There is no 30s delay between getting the pid and verifying the pid. It's milliseconds at most.
checkpid is called every 30s, and checkpid verifies IN THAT INSTANCE that the pid is valid, not 30s later.

05-31-2005, 06:54 PM
So let's see:
11:00:00 - we run the script. checkpid finds a valid crond pid (1234) in /var/run/crond.pid, and /proc/1234 exists too, so we sleep 30 seconds until the next checkpid run.
11:00:05 - crond gets killed. It's not a clean shutdown, so /var/run/crond.pid still exists and its content is still 1234.
11:00:10 - we start process XYZ, and somehow it gets pid 1234.
11:00:30 - checkpid runs again. It reads the crond pid (1234) from the still-existing /var/run/crond.pid and checks whether /proc/1234 exists. Since the new process runs with pid 1234, the /proc dir exists, and checkpid assumes crond is still running when it's not; it's just a new process running with the old crond pid.
Not an issue?
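The timeline above is easy to reproduce without killing anything. This is only a demonstration under assumptions: a temp file stands in for /var/run/crond.pid, and the current shell's own pid plays the part of the recycled pid 1234.

```shell
#!/bin/bash
# Demo: a stale pid file still passes the "/proc/<pid> exists" test
# when the pid now belongs to some other live process.
stale_pidfile=$(mktemp)          # stand-in for /var/run/crond.pid
echo "$$" > "$stale_pidfile"     # pretend crond died and this shell got its pid

thispid=$(cat "$stale_pidfile")
if [ -d "/proc/$thispid" ]; then
    verdict="looks alive"
else
    verdict="looks dead"
fi

# Show what is actually running under that pid (cmdline is NUL-separated)
echo "pid $thispid $verdict, but it is really: $(tr '\0' ' ' < "/proc/$thispid/cmdline")"
rm -f "$stale_pidfile"
```

The directory test succeeds even though no cron process is involved, which is exactly the false "Normal Cron Running" case.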

06-13-2005, 03:41 PM
Junior Guru
Join Date: Apr 2005
Location: /usr/share/zoneinfo/EST5EDT
Posts: 246
Quote:
Originally posted by sehe
not an issue?
You can always add:
Code:
grep -q cron /proc/$thispid/cmdline
if [ $? != 0 ]
then
    echo "Other process"
fi
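One caveat with a plain grep for "cron": it also matches any process that merely has "cron" somewhere in its arguments, including croncheck.sh itself. A slightly stricter sketch compares only the program name. The function name pid_is_command and the optional third argument (an alternate proc root, handy for trying it against a fake directory) are made up for illustration.

```shell
#!/bin/bash
# Sketch: match on the program name (argv[0]) only, not on any argument.
# pid_is_command is a hypothetical helper; $3 defaults to the real /proc.
pid_is_command() {
    local pid=$1 name=$2 root=${3:-/proc} argv0
    # Guard against an empty pid: "/proc//cmdline" is the kernel command line
    [ -n "$pid" ] || return 1
    [ -r "$root/$pid/cmdline" ] || return 1
    # cmdline is NUL-separated; the first field is argv[0]
    argv0=$(tr '\0' '\n' < "$root/$pid/cmdline" | head -n 1)
    # Compare the basename so "/usr/sbin/crond" and "crond" both match
    [ "${argv0##*/}" = "$name" ]
}

# Possible use inside checkpid:
#   if ! pid_is_command "$thispid" crond; then echo "Other process"; fi
```

A process started as "vi /etc/crontab" would slip past grep -q cron but is rejected here, since its program name is vi.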

07-18-2005, 07:50 PM
OK, modified it a bit (it was sending out a TON of spam previously). New code is here:
Code:
#!/bin/bash
# Automated crond check script. Will restart cron when it's down.
# Script runs every 30 seconds, checks for pid and pid file, then if not found
# restarts cron.
LOGDATE=`date "+%m-%d-%y [%k:%M:%S]"`
mail=/bin/mail
sysadmin=you@yourdomain.com
cronfile=/tmp/cron.txt
logfile=/var/log/croncheck.log
echo "$LOGDATE - Cron Check Service starting up " >> $logfile

function stopcron
{
    /sbin/service crond stop
}

function startcron
{
    DATE=`date "+%m%d%y [%k:%M]"`
    # Stop first to clear any stale state, then start fresh
    stopcron
    echo "Cron Service Down, attempting to restart on $DATE" >> $cronfile
    cat $cronfile | $mail -s 'Cron Restart' $sysadmin
    /sbin/service crond start
    rm -f $cronfile
}

function checkpid
{
    LOGDATE=`date "+%m-%d-%y [%k:%M:%S]"`
    # ps xua | grep -q crond
    # Let the init script decide whether crond is up.
    # -q keeps the status line off stdout, -F matches the dots literally.
    /sbin/service crond status | grep -qF "is running..."
    if [ $? != 0 ]
    then
        echo "$LOGDATE - No Cron Running" >> $logfile
        startcron
    else
        echo "$LOGDATE - Normal Cron Running" >> $logfile
    fi
}

function quit {
    exit
}

# COUNTER is never incremented, so this loop runs until the script is killed
COUNTER=0
while [ $COUNTER -lt 10 ]; do
    checkpid
    sleep 30s
done
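This also solves the PID issue, since the init script's status check decides whether crond is running instead of a bare /proc lookup. So far this hasn't let me down, and it's restarted crond a couple of times, so I know it works.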
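To see how the status check behaves without touching a live crond, the grep can be wrapped in a small function and fed sample text. The function name status_says_running is made up for illustration, and the sample lines are only assumptions about what a Red Hat-style init script typically prints; compare them with the real output of /sbin/service crond status on your server.

```shell
#!/bin/bash
# Sketch: decide "running or not" from the text printed by the status command.
# status_says_running is a hypothetical helper, not part of the script above.
status_says_running() {
    # -F: fixed string, so the three dots are literal; -q: exit status only
    printf '%s\n' "$1" | grep -qF "is running..."
}

# Assumed sample outputs, for illustration only
for line in "crond (pid 1234) is running..." \
            "crond is stopped" \
            "crond dead but pid file exists"; do
    if status_says_running "$line"; then
        echo "running: $line"
    else
        echo "restart: $line"
    fi
done

# Possible use in checkpid:
#   status_says_running "$(/sbin/service crond status 2>&1)" || startcron
```

The "dead but pid file exists" case is the stale pid file situation from earlier in the thread, and here it leads to a restart.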

11-07-2006, 09:25 PM
So when the server is restarted, this script would start itself too?

11-19-2006, 02:47 PM
|
Quote:
Originally Posted by conanqtran
so when the server is restarted, this script would start itself too?
If you add the line from the original post to /etc/rc.local then yes, it'll get started on reboot.

12-05-2006, 08:37 AM
So it works fine? Or does anyone have problems with that code?

01-09-2007, 09:13 AM
Quote:
Originally Posted by LowAsYou
so it works fine? or anyone have problem with that code?
This works just fine.