  1. #1
    Join Date
    Jul 2002
    Location
    Italy
    Posts
    344

    Server down every 3 hours...

    One of my servers goes down every 3 hours.

    In /var/log/messages I've found the following message:

    Nov 29 05:00:25 viper kernel: kernel BUG at include/asm/spinlock.h:133!
    Nov 29 05:00:25 viper kernel: invalid operand: 0000 [#1]
    Nov 29 05:00:25 viper kernel: SMP
    Nov 29 05:00:25 viper kernel: Modules linked in: loop ipt_owner md5 ipv6 ipt_TOS iptable_mangle ip_conntrack_ftp ip_conntrack_irc ipt_REJECT ipt_LOG ipt_limit iptable_filter ipt_multiport ipt_state ip_conntrack ip_tables autofs4 sunrpc dm_mirror dm_mod button battery ac uhci_hcd ehci_hcd hw_random e1000 floppy ext3 jbd raid1 sata_promise libata sd_mod scsi_mod
    Nov 29 05:00:25 viper kernel: CPU: 2
    Nov 29 05:00:25 viper kernel: EIP: 0060:[<c02cfc61>] Not tainted VLI
    Nov 29 05:00:25 viper kernel: EFLAGS: 00010216 (2.6.9-22.ELsmp)
    Nov 29 05:00:25 viper kernel: EIP is at _spin_lock+0x1c/0x34
    Nov 29 05:00:25 viper kernel: eax: c02e36f9 ebx: ec7bc3fc ecx: f5db6dd8 edx: c014c9b3
    Nov 29 05:00:25 viper kernel: esi: 00000000 edi: 00000000 ebp: ec7bc3d0 esp: f5db6ddc
    Nov 29 05:00:25 viper kernel: ds: 007b es: 007b ss: 0068
    Nov 29 05:00:25 viper kernel: Process cpanellogd (pid: 4651, threadinfo=f5db6000 task=f742e3b0)
    Nov 29 05:00:25 viper kernel: Stack: 00000000 c014c9b3 00000000 00000000 00000000 00000000 00000000 c02cf6e3
    Nov 29 05:00:25 viper kernel: f6cf5a80 c015bf99 00001000 f7e10200 f5b95c18 f7e1329c 00000003 00000000
    Nov 29 05:00:25 viper kernel: 00000000 00000000 ffffffff 00000001 00000000 00000000 00000246 00000000

    Any suggestions?

    Thanks!
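
To confirm the "every 3 hours" pattern before changing anything, the log can be scanned for the `kernel BUG at` line and the gaps between hits measured. The following is a minimal sketch, assuming the syslog format shown in the post above. The function names and the `year` parameter are hypothetical (syslog timestamps carry no year, so it has to be supplied), and the 08:00:25 entry in the sample is made up purely for illustration.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Nov 29 05:00:25 viper kernel: kernel BUG at include/asm/spinlock.h:133!
BUG_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) kernel: kernel BUG at (\S+?)!?$"
)


def find_kernel_bugs(lines, year):
    """Return (timestamp, source_location) for every 'kernel BUG at' line."""
    hits = []
    for line in lines:
        m = BUG_RE.match(line.strip())
        if m:
            # The year is an assumption supplied by the caller.
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            hits.append((stamp, m.group(3)))
    return hits


def gaps_in_hours(hits):
    """Return the time between consecutive hits, in hours."""
    stamps = sorted(stamp for stamp, _ in hits)
    return [(b - a).total_seconds() / 3600 for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    sample = [
        "Nov 29 05:00:25 viper kernel: kernel BUG at include/asm/spinlock.h:133!",
        "Nov 29 05:00:25 viper kernel: SMP",
        # Hypothetical second crash, invented for this example only.
        "Nov 29 08:00:25 viper kernel: kernel BUG at include/asm/spinlock.h:133!",
    ]
    hits = find_kernel_bugs(sample, year=2005)  # the year is assumed
    for stamp, location in hits:
        print(stamp, location)
    print(gaps_in_hours(hits))
```

On a real box the lines would come from something like `open("/var/log/messages")`. Evenly spaced hits would point towards a scheduled job tripping the bug rather than random load.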

  2. #2
    Join Date
    Mar 2003
    Location
    California USA
    Posts
    13,294
    Looks to me like a kernel oops. My suggestion is to upgrade to the latest kernel version, built from source.
    Steven Ciaburri | Industry's Best Server Management - Rack911.com
    Software Auditing - 400+ Vulnerabilities Found - Quote @ https://www.RACK911Labs.com
    Fully Managed Dedicated Servers (Las Vegas, New York City, & Amsterdam) (AS62710)
    FreeBSD & Linux Server Management, Security Auditing, Server Optimization, PCI Compliance

  3. #3
    Join Date
    May 2004
    Location
    San Diego, CA USA
    Posts
    55
    What distro? What version?

    Have you turned off the unneeded crap like pcmcia, isdn, etc.? Are you all patched up and running the latest packaged kernel for your distro?

    I Googled it, and it seems that you are not the only one that has had this problem.

    I don't agree with Steven's suggestion to roll your own kernel from source. Stay with the packaged one, but get the latest if you aren't already using it.

    What did you do before this started happening? What has changed since the time it wasn't doing this? Software upgrade, patches, config change? What?

  4. #4
    Join Date
    Mar 2003
    Location
    California USA
    Posts
    13,294
    Agree with my suggestion or not, I run into these all the time. I have found the default Red Hat 4 / CentOS 4 kernels to be extremely buggy.

    Nov 29 05:00:25 viper kernel: EFLAGS: 00010216 (2.6.9-22.ELsmp)
    He is running CentOS 4 or Red Hat Enterprise Linux 4.

    Kernel errors like this happen all the time; it's no wonder you saw this error on Google a lot. I recently saw one caused by buggy SCSI drivers in the default Red Hat kernels. They were changed in later versions of the 2.6 kernel but not in the rpm kernel.
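
The kernel release quoted above can be pulled out of an oops automatically, which helps when comparing several servers. This is only a sketch, assuming the 2.6-era i386 oops layout from the first post, where the release appears in parentheses on the `EFLAGS:` line. The helper names are hypothetical, and treating a trailing `smp` as the mark of an SMP build is an assumption based on the `ELsmp` naming seen in this thread.

```python
import re

# Matches e.g. "EFLAGS: 00010216 (2.6.9-22.ELsmp)" and captures the release.
RELEASE_RE = re.compile(r"EFLAGS: [0-9a-fA-F]+\s+\(([^)]+)\)")


def kernel_release_from_oops(lines):
    """Return the kernel release from the first EFLAGS line, or None."""
    for line in lines:
        m = RELEASE_RE.search(line)
        if m:
            return m.group(1)
    return None


def is_smp_build(release):
    """Assumption: packaged SMP kernels of this family end in 'smp'."""
    return release.lower().endswith("smp")


if __name__ == "__main__":
    oops = [
        "Nov 29 05:00:25 viper kernel: EIP: 0060:[<c02cfc61>] Not tainted VLI",
        "Nov 29 05:00:25 viper kernel: EFLAGS: 00010216 (2.6.9-22.ELsmp)",
    ]
    release = kernel_release_from_oops(oops)
    print(release, is_smp_build(release))
```

The extracted string can then be checked against the output of `uname -r` on the machine, to make sure the oops came from the kernel that is currently running.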

  5. #5
    Join Date
    Mar 2004
    Posts
    295
    M5Host, what is wrong with Steven's suggestion?
    I personally feel that building your own kernel to match the machine's specs, and only those, instead of using a prebuilt one from an rpm, is a step everyone should take once they are capable and have the time.

  6. #6
    Join Date
    May 2004
    Location
    San Diego, CA USA
    Posts
    55
    Quote Originally Posted by TDK-Kevin
    M5Host, what is wrong with Steven's suggestion?
    I personally feel that building your own kernel to match the machine's specs, and only those, instead of using a prebuilt one from an rpm, is a step everyone should take once they are capable and have the time.
    Of course you can roll your own, I don't disagree with doing that at all. I was only thinking that when troubleshooting, it's better to go with a known quantity rather than possibly introducing changes that may obscure the problem or add to the confusion. The question is what has changed since the system stopped working correctly.

    Not to mention, I haven't any idea what software he's running, or if he knows how to recompile his kernel. If he's running Oracle for example, then it's pretty particular about the kernel. If it's just a geek box, then go ahead, knock yourself out... compile all day long. But, let's start with the basics first. Let's hear more details.

    ...but you're right... compiling a kernel for your box can be great for some.

  7. #7
    Join Date
    Mar 2004
    Posts
    295
    I understand where you are coming from now. Good point.

  8. #8
    Join Date
    Jul 2002
    Location
    Italy
    Posts
    344
    I've just upgraded the kernel to version 2.6.9-22.0.1.ELsmp using rpms.
    Let's see if it happens again. It's weird, though: nothing has changed in the last month and the server never went down before, only in the last 24 hours. And yes, I've removed all the unneeded packages and services...

    It's CentOS 4.2. I'll post the results here in case someone else runs into the same problem.

    Thank you.

  9. #9
    Can it be a DDoS?

    My server is facing a similar problem. It went down 4 or 5 times this evening. After a reboot everything is OK and the load is below 1. I log in to the server and run "top", and after some time the server goes down again, with nothing unusual showing in "top". In the end the DC told me it was a DDoS and null-routed the IP that was being attacked. Now the server seems to be working fine.

  10. #10
    Most of the time a DDoS attack puts a constant strain on the server, but you never know unless you check.
    Eleven2 Web Hosting - World-Wide Hosting, Done Right!

  11. #11
    Join Date
    Jul 2002
    Location
    Italy
    Posts
    344
    It seems the kernel update fixed the issue...
