Web Hosting Talk

View Full Version : server issues (new server online)


AlaskanWolf
12-29-2001, 06:11 PM
server is running linux 7.1 with newly installed cpanel

from what i can guess, the server was pretty stable before cpanel was installed (i am GUESSING, since it was only at the NOC for 2 days prior to installing cpanel; for those 2 days the server never crashed at all)

the last few nights, the server has "frozen" in the middle of the night and all i get is the blinking cursor. hard rebooting of course gives the fsck errors telling you to run it manually

the logs don't give any clues about why it froze either. the same thing happened when layer1 of cpanel was done: it was cleaning up temp files and froze there......

I also reverted to the original kernel, 2.4.2-2smp, and am going to run that for 24 hours. if the server is not responding when i wake up, then maybe it's a hardware issue?

some things i picked from the logs


Dec 28 16:46:31 localhost kernel: swap_dup: Bad swap file entry 00000050 (a lot of them)
Dec 28 16:47:52 localhost kernel: <1>Unable to handle kernel paging request at virtual address db12e894
Dec 28 16:47:52 localhost kernel: printing eip:
Dec 28 17:25:57 localhost kernel: EXT2-fs error (device sd(8,5)): ext2_free_blocks: Freeing blocks in system zones - Block = 18, count = 1

(FYI: the server crash occurred DEC 29th @ 4:30am...so those above are from the previous day)

ADD: Reloaded the 2.4.2-2smp kernel after booting, no errors in /var/log/messages......
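
For anyone wanting to pull lines like the ones above out of a large /var/log/messages, here is a minimal Python sketch. It is only an illustration: the three patterns are copied from the messages quoted in this thread, and the category names (swap, oops, ext2) and the function names are made up for the example, not part of any standard tool.

```python
import re
from collections import Counter

# Patterns come from the kernel messages quoted in this thread.
# The category names are hypothetical labels chosen for this example.
PATTERNS = {
    "swap": re.compile(r"swap_dup: Bad swap file entry"),
    "oops": re.compile(
        r"Unable to handle kernel (paging request|NULL pointer dereference)"
    ),
    "ext2": re.compile(r"EXT2-fs error"),
}


def classify(line):
    """Return the category for a log line, or None if no pattern matches."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count matching lines per category and keep the first example of each."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        category = classify(line)
        if category is None:
            continue
        counts[category] += 1
        # setdefault only stores the line the first time a category appears.
        first_seen.setdefault(category, line.rstrip("\n"))
    return counts, first_seen
```

Passing an open log file (or any list of lines) to summarize() gives a count per category plus the first matching line, which makes it easier to see whether the swap and filesystem errors started before or after the first oops. This only filters text; it does not decode the oops the way ksymoops does.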

The Prohacker
12-29-2001, 07:14 PM
What kernel version did you update to?

AlaskanWolf
12-29-2001, 07:16 PM
2.4.9

The Prohacker
12-29-2001, 07:33 PM
Have you tried 2.4.17?

AlaskanWolf
12-29-2001, 08:45 PM
no, not yet. if it's the kernel, i guess i will know when i wake up tomorrow and see whether the server is still responding...

i just ran memtest and through the first pass, everything seems normal

Run 1:
Test 1: Stuck Address: Testing...Passed.
Test 2: Random value: Setting...Testing...Passed.
Test 3: XOR comparison: Setting...Testing...Passed.
Test 4: SUB comparison: Setting...Testing...Passed.
Test 5: MUL comparison: Setting...Testing...Passed.
Test 6: DIV comparison: Setting...Testing...Passed.
Test 7: OR comparison: Setting...Testing...Passed.
Test 8: AND comparison: Setting...Testing...Passed.
Test 9: Sequential Increment: Setting...Testing...Passed.
Test 10: Solid Bits: Testing...Passed.
Test 11: Block Sequential: Testing...Passed.
Test 12: Checkerboard: Testing...Passed.
Test 13: Bit Spread: Testing...Passed.
Test 14: Bit Flip: Testing...Passed.
Test 15: Walking Ones: Testing...Passed.
Test 16: Walking Zeroes: Testing...Passed.
Run 1 completed in 4965 seconds (0 tests showed errors).

Gernot
12-29-2001, 09:28 PM
Your kernel oopsed, which isn't good at all. Run ksymoops on your /var/log/messages file and save the results somewhere. It's always worth doing, as it gives you an impression of where the error occurred.
My suggestion is to:
a) Upgrade to 2.4.17 asap and then
b) Run a few disk benchmarks like bonnie or dbench to check whether the new kernel has fixed the problem. Your errors look very much like either a kernel bug in the driver for your SCSI controller or a hardware failure (either the disk or the SCSI controller).

AlaskanWolf
12-30-2001, 12:38 AM
I had HE hook up a monitor to it, and he said this showed up on the screen


Unable to handle kernel NULL pointer dereference at virtual address 00000008
process sshd(pid: 1164 stackpage=d7eeb000)


then he said it started dumping a lot of processes, and from the looks of the messages, he rebooted the server