  1. #1
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229

    Testing ECC Memory on IBM eServer

    Hello,

    I bought a used server 6 months ago. After I received it I tested it and installed ISPConfig and other tools on it. Everything worked perfectly, but I have not sent it to a datacenter yet. Now I wanted to test it again before sending it, but I got few

    From time to time I got errors like this on the console:

    Code:
    Correctable Errors ... CE k8_edac ... 
    extended error code: ECC chipkill x4 error
    I don't remember the exact errors because I forgot to take a screenshot, but I burned a CD with memtest-86 v3.5a SMP and ran it for 8 hours. By default ECC_Mem was OFF in memtest (it is enabled in the BIOS). During those first 8 hours I didn't get any errors, and it reached Pass 5 or 6.

    Now I have tried with ECC_Mem: ON. It has been running for 6 hours, has reached Pass 5, and shows no errors so far.

    Can I test the RAM with something else?

    Before running memtest I removed and reinserted the RAM.

    I have 4 GB (8 x 512 MB) and dual Opteron 254 processors.

    What do you suggest?

    PS. I ordered another 4 GB of ECC RAM (4 x 1 GB), but it will take some time until the package arrives.
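
On the question of testing the RAM with something else: since the console messages came from the kernel's EDAC driver (k8_edac), one option is to watch the EDAC error counters while the server runs under Linux. The following is a minimal sketch, assuming the driver exposes the legacy sysfs layout under /sys/devices/system/edac/mc with ce_count, ue_count and csrowN/ce_count files. On a given kernel the path or layout may differ, and the function name read_edac_counts is only illustrative.

```python
from pathlib import Path


def _read_int(path):
    """Return the integer stored in a sysfs file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Collect correctable (CE) and uncorrectable (UE) error counts.

    Returns {controller: {"ce": int, "ue": int, "csrows": {name: ce_count}}}.
    The default path is an assumption about the running kernel.
    """
    results = {}
    root = Path(edac_root)
    if not root.is_dir():
        return results  # EDAC driver not loaded, or a different layout
    for mc in sorted(root.glob("mc[0-9]*")):
        results[mc.name] = {
            "ce": _read_int(mc / "ce_count"),
            "ue": _read_int(mc / "ue_count"),
            # Per-row counts help narrow down which module is affected.
            "csrows": {
                row.name: _read_int(row / "ce_count")
                for row in sorted(mc.glob("csrow[0-9]*"))
            },
        }
    return results


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found")
    for name, info in counts.items():
        print(name, "CE:", info["ce"], "UE:", info["ue"], info["csrows"])
```

Running this before and after a load test and comparing the numbers shows whether new correctable errors appeared. A csrow whose count keeps climbing would point at the module (or slot) to pull first; how csrows map to physical DIMM slots depends on the board.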

  2. #2
    Well, I can tell you for sure that bad contact can result in errors.
    That's one of the main problems with second-hand servers.

    I'd suggest you clean everything and then perform the tests.
    Tip: when cleaning RAM contacts, use a white pencil eraser.
    For everything else, use a dust cleaner and an unused paint brush.

    Another thing that can result in errors OR self-reboots is the power supply, and if you decide to clean it, be EXTRA CAREFUL. Even unplugged, the PSU can still hold enough charge to do harm!

    If you still have errors AFTER cleaning, try removing the RAM modules one by one and testing until you find the faulty one.
    Last edited by Onepamopa; 02-17-2011 at 12:43 PM.

  3. #3
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    memtest has now been running for just over 11 hours and has not found an error. About the second-hand thing: the server came to me in excellent condition, and I was amazed how clean everything looked. They also tested the server before shipping it, and I tested it with some server stability software and had no problems when I received it. But maybe some dust got in during the months the server sat in my storage room... I don't really know what to say.

    I think I will stop memtest now and boot the server to see if the errors show up on the console again.

    Can you tell me what software I can install on the server to test stability again? When I got it, it had a Windows OS and I used Windows tools, but now I have Linux on it, so I need something else.

  4. #4
    I agree with Onepamopa: cleaning the contacts usually brings 90% of computers back to life.

  5. #5
    You can try some benchmarking tools; check here: http://lbs.sourceforge.net/
    It still wouldn't be a bad idea to clean it and to measure/test the PSU.
    Last edited by Onepamopa; 02-17-2011 at 08:23 PM.

  6. #6
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    Thank you for that link. I ran UnixBench more than 10 times yesterday and the day before. So far I have not seen any errors on the screen.

    I will clean all the contacts again, but I don't know how to measure the PSU. I'm very confident that the problem was dust, because since I removed and reinserted the RAM I have not had any errors... yet. That covers about 20 hours of memtest plus another 12 hours since the server was booted, during which the only load was running UnixBench multiple times.

  7. #7
    Listen, every PSU (power supply unit) has a "life" of several years. After that it may still work, but the voltages start to vary (they have to be CONSTANT). This is caused by bad capacitors. Better find someone who knows what they're doing and test the PSU UNDER LOAD (that means the server has to be running; the power supply needs some load on it in order to be measured properly). Another thing: new PSUs for old servers are very difficult to find.

  8. #8
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    Now I'm testing the server with the ab tool. I will post a few graphs generated by Munin. The load, memory, network, and temperature graphs were generated by the server itself, and the power consumption graph was generated by another computer to which the APC UPS was connected (the UPS powered only one device, the server under test).

    At idle the server consumed 160 W and at full load ~200 W (the PSU is rated at 411 W / IBM 39Y7166). I think a load average of 190.55 189.37 184.87 is high enough to trigger a problem. I want to test it more and use even more RAM. I don't know why used memory was only ~1500 MB of the 4000 MB available, while the memory graph generated by Munin shows a committed value of 6 - 8 GB.
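
On the gap between used and committed memory: assuming the Munin "committed" line is taken from the Committed_AS field of /proc/meminfo, it counts memory that processes have asked the kernel for, whether or not they have touched it yet, so it can exceed both the used figure and the physical RAM. With hundreds of concurrent Apache connections, many worker processes each reserving address space would be one plausible source. The sketch below parses /proc/meminfo-style text and computes how committed memory compares with RAM plus swap; the function names are illustrative.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def commit_ratio(meminfo):
    """Committed_AS divided by (MemTotal + SwapTotal).

    A value above 1.0 means more memory has been promised to processes
    than RAM and swap could back if all of it were actually used.
    """
    backing = meminfo["MemTotal"] + meminfo.get("SwapTotal", 0)
    return meminfo["Committed_AS"] / backing


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        print("commit ratio: %.2f" % commit_ratio(info))
```

A high ratio on its own is not an error; it only becomes a problem if the processes really start using what they reserved.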

    This server will mostly be used as disk space for backups generated by a cPanel server, and probably for just a few personal websites.

    Below I will post a summary from the ab tool.

    Code:
    # ab -kc 500 -n 10000 http://www.mydomain.com/de/categories/metal-ds-range-fence-1.html
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright 2006 The Apache Software Foundation, http://www.apache.org/
    
    Benchmarking www.mydomain.com (be patient)
    Completed 1000 requests
    Completed 2000 requests
    Completed 3000 requests
    Completed 4000 requests
    Completed 5000 requests
    Completed 6000 requests
    apr_socket_recv: Connection reset by peer (104)
    Total of 6005 requests completed
    # ab -kc 500 -n 5000 http://www.mydomain.com/de/categories/metal-ds-range-fence-1.html
    This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
    Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
    Copyright 2006 The Apache Software Foundation, http://www.apache.org/
    
    Benchmarking www.mydomain.com (be patient)
    Completed 500 requests
    Completed 1000 requests
    Completed 1500 requests
    Completed 2000 requests
    Completed 2500 requests
    Completed 3000 requests
    Completed 3500 requests
    Completed 4000 requests
    Completed 4500 requests
    Finished 5000 requests
    
    
    Server Software:        Apache/2.2.3
    Server Hostname:        www.mydomain.com
    Server Port:            80
    
    Document Path:          /de/categories/metal-ds-range-fence-1.html
    Document Length:        0 bytes
    
    Concurrency Level:      500
    Time taken for tests:   177.873843 seconds
    Complete requests:      5000
    Failed requests:        399
       (Connect: 0, Length: 399, Exceptions: 0)
    Write errors:           0
    Non-2xx responses:      4601
    Keep-Alive requests:    0
    Total transferred:      12150581 bytes
    HTML transferred:       9514534 bytes
    Requests per second:    28.11 [#/sec] (mean)
    Time per request:       17787.384 [ms] (mean)
    Time per request:       35.575 [ms] (mean, across all concurrent requests)
    Transfer rate:          66.70 [Kbytes/sec] received
    
    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:        0 1360 5209.5      0   93009
    Processing:    67 14082 18858.3   5437  142115
    Waiting:       66 13685 18371.9   5416  140154
    Total:       1627 15443 19229.6   6512  142204
    
    Percentage of the requests served within a certain time (ms)
      50%   6512
      66%  10324
      75%  17504
      80%  25570
      90%  43273
      95%  50695
      98%  83150
      99%  105686
     100%  142204 (longest request)
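
The summary figures in the output above can be cross-checked, and the error counts put in proportion, with a short script. This is a minimal sketch in Python that parses only the fields quoted above; the function names are illustrative and the sample text is copied from the second run.

```python
import re

SAMPLE = """\
Concurrency Level:      500
Time taken for tests:   177.873843 seconds
Complete requests:      5000
Failed requests:        399
Non-2xx responses:      4601
"""


def parse_ab_summary(text):
    """Extract a few numeric fields from ApacheBench summary text."""
    patterns = {
        "concurrency": r"Concurrency Level:\s+(\d+)",
        "seconds": r"Time taken for tests:\s+([\d.]+) seconds",
        "complete": r"Complete requests:\s+(\d+)",
        "failed": r"Failed requests:\s+(\d+)",
        "non_2xx": r"Non-2xx responses:\s+(\d+)",
    }
    out = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match:
            out[key] = float(match.group(1))
    return out


def derive_ab_metrics(stats):
    """Recompute throughput and error shares from the parsed fields."""
    complete = stats["complete"]
    seconds = stats["seconds"]
    return {
        "requests_per_second": complete / seconds,
        "ms_per_request": stats["concurrency"] * seconds * 1000 / complete,
        "ms_per_request_across": seconds * 1000 / complete,
        "failed_share": stats["failed"] / complete,
        "non_2xx_share": stats.get("non_2xx", 0.0) / complete,
    }


if __name__ == "__main__":
    for name, value in derive_ab_metrics(parse_ab_summary(SAMPLE)).items():
        print("%-24s %.4f" % (name, value))
```

For this run the recomputed values match the report (about 28.11 requests per second, 17787.384 ms per request), and 4601 of 5000 responses, about 92%, were non-2xx. ab counts a "Length" failure when a response's size differs from the first response, and the document length was recorded as 0 bytes, so the 399 "failed" requests may actually be the ones that returned the real page. If so, the benchmark mostly measured error or redirect responses rather than normal page loads, which is worth checking in the Apache logs.
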
    The last spike you see in the network graph is from running the ab tool against a 700 KB JPG file using 10000 concurrent connections:

    Code:
    # ab -kn 10000 -n 5000 http://www.mydomain.com/images/afis.jpg
    Tonight I will let it run until morning with about 300 concurrent connections to a PHP file. 300 seems safe, as I didn't get any "Connection reset by peer" error with fewer than 400 concurrent connections.

    Any other suggestions are welcome before I send the server to the datacenter.
    Attached Thumbnails: backup.xtremewd.com-memory-day.png, backup.xtremewd.com-load-day.png, power_consumption_vs_htop_full_load.jpg, backup.xtremewd.com-sensors_temp-day.png, backup.xtremewd.com-if_eth0-day.png


  9. #9
    Find someone who can work with measurement tools (a voltmeter) and knows power supply units. It's a simple matter of measuring the voltage under load to see whether there's a problem. A long time ago I had a PC with a PSU that output 4.05 V (instead of 5.00 V) and 11.00 V (instead of 12 V)... it worked... for a time (it eventually died because the PSU shorted and fried the motherboard and everything else). Better safe than sorry, right?

  10. #10
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    I have a multimeter and tried to test the PSU. I also took some pictures to show how I did it.

    The +12 V line was constant at 12.31 V and never changed; I tested each of the yellow wires for about 1 minute.

    The +5 V line was not as stable, but the fluctuation was very small. Maybe you can tell me whether this might be an issue: 3 of the red wires read 5.03 - 5.04 V, and one stayed fixed at 5.04 V. This was the only difference I could see, so could 0.01 V indicate a problem?

    There is another connector with fewer wires coming from the PSU, but I could not test it because there is no free space to insert the multimeter probes.
    Attached Thumbnails: 2011.02.19 PSU Testing 004_.jpg, 2011.02.19 PSU Testing 005_.jpg, 2011.02.19 PSU Testing 007_.jpg
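
To put these readings in proportion, they can be compared with a tolerance band. The sketch below assumes the commonly used ATX limit of ±5% on the +12 V and +5 V rails. Whether the IBM 39Y7166 supply follows exactly that limit is an assumption, so treat the result as a rough sanity check; the function names are illustrative.

```python
# Assumption: ATX-style tolerance of +/-5% on the positive rails.
TOLERANCE = 0.05

# Readings quoted in the post above: (rail name, nominal V, measured V).
READINGS = [
    ("+12V", 12.0, 12.31),
    ("+5V", 5.0, 5.03),
    ("+5V", 5.0, 5.04),
]


def rail_deviation(nominal, measured):
    """Signed deviation from nominal as a fraction (0.01 means +1%)."""
    return (measured - nominal) / nominal


def within_tolerance(nominal, measured, tolerance=TOLERANCE):
    """True if the measured voltage lies inside the tolerance band."""
    return abs(rail_deviation(nominal, measured)) <= tolerance


if __name__ == "__main__":
    for rail, nominal, measured in READINGS:
        deviation = rail_deviation(nominal, measured) * 100
        status = "ok" if within_tolerance(nominal, measured) else "OUT OF RANGE"
        print("%-5s %6.2f V  %+5.1f%%  %s" % (rail, measured, deviation, status))
```

With these numbers the +12 V reading is about 2.6% high and the +5 V readings are under 1% high, so both sit inside the assumed band. A 0.01 V difference between wires is 0.2% of nominal and a single step of the two-decimal display used here. By contrast, the 4.05 V and 11.00 V readings mentioned in post #9 would be about 19% and 8% low.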

  11. #11
    Join Date: Jan 2011 | Location: India | Posts: 1,446
    I also have a question about used servers.
    Is it safe to buy a used server? Like many people, I would like to buy one because it can be had at a lower price.

  12. #12
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    This is my first used server, and it has not gone into production so far. I want to make sure it is OK to send it to a datacenter, where it will mostly be used for backups.

    I think with used equipment you have to accept the risks, as you don't know when something may fail.

  13. #13
    There are some risks when buying used servers, because you never know what condition they were in (how they were used, etc.) before you got them. The best thing is to have everything tested before buying, but that's not always possible.

    @ovisopa, it seems your PSU is in good working condition. I'd recommend cleaning it, but be careful!!! After that, it's safe to send it to a datacenter and start your work.

  14. #14
    Join Date: Dec 2002 | Location: Sibiu, Romania | Posts: 229
    To clean the PSU I have only one option: blowing air into it. I tried to open the PSU, but there is one screw under a sticker, probably for warranty purposes. The warranty expired long ago, but I didn't want to break the "seal" just to open the PSU, visually check it, and clean it.

    Thank you for all your advice.

    Have a nice weekend.
