Thread: SuperMicro issues!
-
03-06-2012, 08:40 AM #1 Newbie
- Join Date
- Jan 2008
- Location
- Sweden
- Posts
- 11
SuperMicro issues!
Hi WHT members,
I'm not sure this is the correct part of the forum, but I hope that someone has a solution or some more info regarding this.
Anyway, we are having issues with our Supermicro servers, which seem to go into some sort of lockdown mode. When we plug a console into the physical server, we are presented with a black screen. The console gets a signal but no picture. We tried using both a USB and a PS/2 keyboard to "wake" the screen, but with no luck. After a reboot the server works for some time before the same thing happens again.
At first we thought that the servers might have overheated, but they were running at a normal temperature.
The logs report nothing and the problem exists on both Windows and Linux machines. All of the affected servers are Supermicro servers, so my question is: has anyone else had the same problem? And if so, did you find some sort of solution for it?
Kind Regards,
-
03-06-2012, 03:33 PM #2 Junior Guru Wannabe
- Join Date
- Apr 2002
- Posts
- 76
In the “lock down mode” is the server actually hung, or does it continue to serve network requests? Does it work over remote desktop?
Do you have IPMI configured for network access?
What is the state of the various LED indicators on the front panel?
When you enter BIOS, is there anything in the DMI Event Log?
Do you have Supero Doctor up and running? You can configure it to send you alerts if anything goes wrong with temperatures/fans/voltages. You can also configure it to save parameters to a log file every n seconds.
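If Supero Doctor is not available on a given machine, periodic logging can be approximated with a small script. The sketch below is only an illustration and not part of Supero Doctor: it assumes `ipmitool` is installed and can reach the local BMC, and that `ipmitool sensor` prints pipe-separated columns (name, value, unit, status, thresholds). The function names and the log format are made up for this example. Each sample is flushed to disk so the last readings before a freeze survive.

```python
import os
import subprocess
import time
from datetime import datetime


def parse_sensor_output(text):
    """Turn pipe-separated `ipmitool sensor` output into (name, value, unit, status) tuples."""
    readings = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Skip blank or malformed lines that lack the first four columns.
        if len(fields) < 4 or not fields[0]:
            continue
        readings.append(tuple(fields[:4]))
    return readings


def log_sensors(path, interval_seconds, iterations=None):
    """Append one timestamped sample every interval_seconds (runs forever if iterations is None)."""
    count = 0
    while iterations is None or count < iterations:
        # Assumption: ipmitool is on PATH and the local BMC answers.
        result = subprocess.run(
            ["ipmitool", "sensor"], capture_output=True, text=True, check=True
        )
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(path, "a") as log:
            for name, value, unit, status in parse_sensor_output(result.stdout):
                log.write(f"{stamp}\t{name}\t{value}\t{unit}\t{status}\n")
            # Force the sample to disk so it is not lost if the box hangs.
            log.flush()
            os.fsync(log.fileno())
        count += 1
        time.sleep(interval_seconds)
```

Calling `log_sensors("sensors.log", 30)` would take a sample every 30 seconds; after a hang, the tail of the file shows the last temperatures, fan speeds and voltages that were recorded.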
We had a recent thread here about a server freezing; you can look there and in many other similar threads for ideas:
http://www.webhostingtalk.com/showth...ht=SuperServer
-
03-06-2012, 04:04 PM #3
If the server also stops responding on SSH etc., then I would suspect faulty hardware. Usually the motherboard is to blame, but it could also be the RAM. I would normally check for overheating as well, but you already did that.
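To tell a dead console from a fully hung machine, it helps to probe the network services from another host the moment the screen goes black. A minimal sketch, assuming SSH listens on port 22 and Remote Desktop on port 3389 (adjust to whatever the server actually runs); the function names are made up for this example:

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe(host, ports=(22, 3389)):
    """Check each port and return a {port: reachable} mapping."""
    return {port: port_open(host, port) for port in ports}
```

If every port comes back unreachable while the front-panel LEDs still look normal, that points to the whole system being hung and not just the video output.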
-
03-06-2012, 04:06 PM #4 Newbie
- Join Date
- Jan 2008
- Location
- Sweden
- Posts
- 11
Hi TObject,
Thank you for your reply. I will make sure that SuperDoctor gets installed on our servers tomorrow.
The server does not answer on anything when in lockdown mode. The indicators show normal activity. Everything seems normal, even the logs, but I will have a look at the DMI Event Log and set up SuperDoctor tomorrow morning.
Edit: We suspected a hardware fault, but we've replaced the RAID cards and RAM on some of the servers and the problem still returns.
Regards,
Martin
-
03-06-2012, 06:46 PM #5 Junior Guru Wannabe
- Join Date
- Apr 2002
- Posts
- 76
Also, write down what motherboard is in those servers. If it is X8STI, then there is a known issue of an extra standoff under the motherboard causing shorts.
-
03-06-2012, 07:09 PM #6 Aspiring Evangelist
- Join Date
- Jan 2008
- Location
- Vancouver, BC
- Posts
- 422
We had the same issue last week on an X9SCM-F while upgrading the memory from 8GB to 16GB as per customer request.
The server could run the 8GB setup with no problem on Kingston or Super Talent 4GB sticks, but as soon as we upgraded it to 16GB (that is, added 2 new 4GB sticks of identical brand/type) it ran into a black screen of death, and we had to do a hard reset from IPMI or the APC to bring it up again.
We first suspected a software issue, so we went ahead and did some registry edits in Windows 2008 R2, did some pagefile configuration, and added an SSD drive to make sure it was not a random pagefile issue. No results!
No events appeared in the Windows event logs either, and there were no logs to indicate any faulty driver or configuration from our/customer side.
Then we went to the IPMI error logs to see if there was any heating or DIMM issue reported. The result was an empty queue!
After some other steps (ordering new memory from different brands, changing the RAID controller, and even trying another OS on a different HDD, since it was a production server) we finally gave up!
We moved the customer's HDDs along with the RAID card to a new identical server with 16GB RAM and it worked fine, no more crashes.
We already requested an RMA from SM. It's a bit of a pain but definitely worth it for a clean production environment for the next customer.
I suggest the same to you, go for mobo RMA and you'll be fine.
- Aria
-
03-07-2012, 06:08 AM #7 Newbie
- Join Date
- Jan 2008
- Location
- Sweden
- Posts
- 11
That sounds 'interesting'. The servers are in production and used by our clients, so finding out the motherboard type might take some time.
ASN-Aria:
That's probably the easiest way, but it would mean extra downtime for our customers. Hopefully it's possible to solve this more easily. And according to our distributor they test all the servers before they send them to us. I don't know what tests they are running, but it's probably just the basic memtests etc.
So if anyone knows about some good tests we could try as well, then let me know.
Thank you all for the responses. I will keep you updated on the progress as well.
Regards,
Martin
-
03-07-2012, 06:39 AM #8 Aspiring Evangelist
- Join Date
- Jan 2008
- Location
- Vancouver, BC
- Posts
- 422
Checking the mobo type is possible using any of the following methods:
1. KVM: reboot the server and press DEL repeatedly to bring up the CMOS settings; you'll usually see the mobo type there.
2. KVM: reboot the server and wait until the POST screen comes up; you'll see the mobo type there as well.
3. Windows: CPUID software such as CPU-Z.
4. Linux: there are a handful of commands, for example dmidecode.
5. Manual way: open the chassis lid and you'll see the mobo model next to the Supermicro logo.
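For the Linux option, the board model can usually be read from a running system, so no reboot is needed. A minimal sketch, assuming the kernel exposes DMI data under `/sys/class/dmi/id` (common on x86 Linux, but not guaranteed on every kernel or board); the function name is made up for this example:

```python
from pathlib import Path


def read_board_info(base="/sys/class/dmi/id"):
    """Read motherboard vendor, model and version from the DMI sysfs directory."""
    info = {}
    for key in ("board_vendor", "board_name", "board_version"):
        try:
            info[key] = (Path(base) / key).read_text().strip()
        except OSError:
            # File missing or unreadable on this system.
            info[key] = None
    return info
```

Running `print(read_board_info())` on the live server should show something like the board name printed on the POST screen, provided those sysfs files exist.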
Afterward:
You may want to order one of the same mobo to keep as an onsite spare; that can save you a lot of money.
Good luck anyway.
- Aria
-
03-07-2012, 06:55 AM #9 Newbie
- Join Date
- Jan 2008
- Location
- Sweden
- Posts
- 11
Thank you for the reply. The problem isn't that we can't find out what motherboards are in the server but more the fact that our customers don't want us to issue a reboot on their dedicated servers. We will have a look as soon as a customer lets us or the next time one of them goes down.
Regards,
Martin