I have a server with a MegaRAID 320-1 SCSI RAID controller and six Seagate 10K.6 drives. This server crashes intermittently with MCEs (machine check exceptions) and kernel oopses. Sometimes it just hangs with no messages in the log.
We ran memtest86 v3.1 for about 2 hours and it came back clean. I had the guys at my datacenter go over my logs looking for discrepancies... but nothing. No software config errors either. They seem to think that there is some kind of incompatibility between the LSI MegaRAID controller and Linux.
MegaRAID is the OEM source for some of Dell's PERC controllers, so I would figure that MegaRAID SHOULD NOT have any issues with Red Hat... otherwise Dell would probably not use them as an OEM. I'd like to know if anyone else can shed some light on this for me. Thanks.
I actually have custom-built Hitachi U320 SCSI cables. Originally, the cables were causing a problem because I was using rounded SCSI cables... but I quickly realized that and got these.
The funny thing is, there doesn't seem to be any correlation between disk activity and the crashes. I'm thinking it's just a random glitch that pops up. I am using beta drivers provided to me by LSI... they were supposed to fix some I/O timeout issues.
Are you suggesting I try Dell's drivers for this card for better results?
Well, the AACRAID/PERCRAID drivers are included in the Linux kernel sources from 2.4.x onward. There was a gentleman by the name of Matt Domsch, I believe, who developed these drivers for use on Dell machines because he discovered that support for Linux on Dell machines, well, sucked. I'm sorry that I cannot provide any more info; it's been quite a while since I've dealt with this hardware. That name is key, however - Matt Domsch. Heck, if you email him, he'll probably respond. Good luck.
I was just looking into this, actually. I'm trying out some crap HighPoint setup which does RAID 1/0, but it sucks. It's software RAID emulated off of the card's chip, not true hardware RAID.
I've been looking into some of what 3ware has to offer, and it looks promising. I don't have a lot of details on their setups right now, but I will post here when I find something. Come back in a couple of days; I'd be more than happy to share what I learn.