  1. #1
    Join Date
    Jan 2011
    Posts
    127

    How often do CPUs actually fail?

    How often does the CPU itself actually fail after use (meaning not DOA)?

    Has anyone run into any issues with a CPU failing, and if so, how much use had the CPU seen, or how old was it, before the failure occurred?

  2. #2
    Join Date
    Nov 2004
    Location
    Chicago
    Posts
    413
    What are you trying to determine?
    Lee Evans, Owner/Operator
    LeeWare Development
    Linux Dedicated Server Grids
    http://www.leeware.com

  3. #3
    Join Date
    Jan 2011
    Posts
    127
    I have a pair of x5650 processors, but I cannot claim warranty on them since they are no longer covered by Intel's 3-year warranty. That sparked my curiosity: how often do CPUs actually run into issues that require an RMA?

  4. #4
    Join Date
    May 2006
    Location
    San Francisco
    Posts
    7,200
    I'd imagine it's quite rare from anecdotal evidence.

  5. #5
    Join Date
    Mar 2003
    Location
    chicago
    Posts
    1,557
    I have only seen a few fried CPUs over the years, and those had fans that broke down and boards that did not shut off in time.

    It's been years, though, since I have had a CPU fail.

  6. #6
    Join Date
    Feb 2008
    Location
    Houston, Texas, USA
    Posts
    2,955
    Quote Originally Posted by celona View Post
    I have a pair of x5650 processors but I cannot claim warranty for them, no longer covered by Intel's 3 year warranty.
    Are you sure warranty is out? Those came out in 2010. You still have about two years left.

    Regards
    Joe
    UNIXy - Fully Managed Servers and Clusters - Established in 2006
    [ cPanel Varnish Nginx Plugin ] - Enhance LiteSpeed and Apache Performance
    www.unixy.net - Los Angeles | Houston | Atlanta | Rotterdam
    Love to help pro bono (time permitting). joe > unixy.net

  7. #7
    Join Date
    Nov 2004
    Location
    Chicago
    Posts
    413
    From my experience, about 1 in 1,000 CPUs fail. Memory, CPU fans, PSUs, motherboards, and HDDs fail more often (in the order listed).

  8. #8
    Join Date
    Jun 2002
    Location
    Waco, TX
    Posts
    5,292
    I've had some memory controllers on AMD chips fail over time, but it's still quite rare. (I've had more Intel memory controllers on the motherboard fail, however, so it isn't an 'AMD issue'; it's just that AMD has had the memory controller on the CPU die for a lot longer.)

    With the newer AMD and Intel chips, I've had fans fail and the CPUs were fine, as they have built-in throttling to save themselves from self-destructing.

  9. #9
    Join Date
    Jul 2008
    Location
    Dallas, TX
    Posts
    107
    Out of every CPU we've gotten at Limestone we have never had to RMA a single one.

  10. #10
    Join Date
    May 2006
    Location
    NJ, USA
    Posts
    6,456
    I have seen a Core2Duo turn out defective after 6 months, where it only showed 1 core. It was strange.
    simplywww: directadmin and cpanel hosting that will rock your socks
    Need some work done in a datacenter in the NYC area? NYC Remote Hands can do it.

    Follow my "deals" Twitter for hardware specials.. @dougysdeals

  11. #11
    Join Date
    Jan 2003
    Location
    Chicago, IL
    Posts
    6,889
    It is a bit more common with the memory controllers on the CPU now, where you'll get odd memory issues that are then fixed by a CPU swap, though they're still pretty rare. I'd say 1 in every 700 or so would be defective at some point in their usable lifetime (generally about 5 years). That puts it down to about 0.03% annual failure rate, while hard drives are closer to a 5% annual failure rate.
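As a rough check on the arithmetic above, here is a minimal sketch that spreads a lifetime failure fraction evenly across the years of use. The function name `annual_failure_rate` and the even-spread simplification are assumptions made for illustration; the 1-in-700, 5-year, and 5% figures are the ones quoted in the post.

```python
def annual_failure_rate(failures, units, years):
    """Lifetime failure fraction divided evenly across the years (a simplification)."""
    return failures / units / years

# Figures quoted in the post: 1 in 700 CPUs over about 5 years, HDDs near 5% per year.
cpu_afr = annual_failure_rate(1, 700, 5)
hdd_afr = 0.05

print(f"CPU annual failure rate: {cpu_afr:.4%}")
print(f"HDDs fail roughly {hdd_afr / cpu_afr:.0f}x as often per year")
```

Under these assumptions the CPU figure comes out just under 0.03% per year, which matches the rounded number in the post.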
    Karl Zimmerman - Steadfast: Managed Dedicated Servers and Premium Colocation
    karl @ steadfast.net - Sales/Support: 312-602-2689
    Cloud Hosting, Managed Dedicated Servers, Chicago Colocation, and New Jersey Colocation
    Now Open in New Jersey! - Contact us for New Jersey colocation or dedicated servers

  12. #12
    Even with no overclocking, I have seen a quad core show 2 fewer processors than it actually has.
    The data center's quality guidelines for HDD maintenance should be kept in a very strict level.

  13. #13
    If you're not overclocking, I would say almost never. When they do fail, it's usually because of a complete lack of cooling. I've seen one where the person owned cats and had the computer on the floor. The entire heatsink was clogged with cat dander. Much to my surprise, the CPU actually needed replacing. In a year of repairing computers, that was the only actual CPU failure I saw. The only other times I've seen it are when the CPU was run without a heatsink.

    Hard drive and motherboard failures are easily 20 times as likely, if not 100 times.
    Phoenix Dedicated Servers -- IOFLOOD.com
    Email: sales [at] ioflood.com
    Skype: iofloodsales
    Backup Storage VPS -- 1TBVPS.com

  14. #14
    Join Date
    Aug 2006
    Location
    Ashburn VA, San Diego CA
    Posts
    4,571
    Honestly, I've never confirmed a CPU failure... when things were still wonky after replacing the RAM and testing the power supply, I swapped the mainboard and CPU just for good measure... I would still guess that most of the issues were caused by the mainboard and not the CPU.
    Fast Serv Networks, LLC | AS29889 | Fully Managed Cloud, Streaming, Dedicated Servers, Colo by-the-U
    Since 2003 - Ashburn VA + San Diego CA Datacenters

  15. #15
    Quote Originally Posted by FastServ View Post
    Honestly, I've never confirmed a CPU failure... when things were still wonky after replacing the RAM and testing the power supply, I swapped the mainboard and CPU just for good measure... I would still guess that most of the issues were caused by the mainboard and not the CPU.
    I've only ever needed to replace a CPU if it fried from overheating, or if it was overclocked. Hard drive issues are easy enough to figure out without blaming any other components. Power supplies, in my experience, either work or they don't, so that's pretty easy to troubleshoot as well.

    If you ever have a problem where the cause isn't obvious (random crashing, intermittent issues), it's almost always the motherboard or the RAM (in that order of likelihood). I really couldn't see replacing the CPU unless you had a good reason to suspect it, but replacing the motherboard should probably be your first measure if you can't figure out what's wrong with a system.

    Edit: If you're working on your systems remotely, do keep in mind that a bad fan might be to blame if you get crashes after a machine has been up for a short while. You'd be surprised what datacenter techs won't notice when troubleshooting your system. A failed case or CPU fan can cause issues you might normally attribute to other parts if nobody was thorough enough to notice it.

  16. #16
    Join Date
    Nov 2010
    Posts
    181
    I believe CPUs have a long lifetime. But RAM! Aahhhhhhh
    Articles and news about Dinosaurs and dinosaur forum.

  17. #17
    Quote Originally Posted by sashaiel View Post
    I believe CPUs have a long lifetime. But RAM! Aahhhhhhh
    RAM is usually pretty good as well, though I wouldn't be as shocked to see it fail as I would a CPU. I have a few hundred gigs of RAM running without any failures (knock on wood).

  18. #18
    Join Date
    Apr 2009
    Location
    Dallas/FortWorth TX
    Posts
    1,677
    In my personal experience, the last CPU I remember burning out was an "Intel 486".
    IPStrada When uptime counts.
    Warren Buffet: Honesty is very expensive gift do not expect it from cheap people.

  19. #19
    Join Date
    Jan 2011
    Posts
    127
    Thanks guys, I think I feel a lot better about these x5650s now

  20. #20
    Join Date
    Nov 2002
    Location
    San Diego, CA
    Posts
    504
    Out of about 30,000 (yes, thirty thousand) CPUs we've used over the last 10 years, we've had about 4 failures. Out of the probably 40-50K pieces of RAM, we have had about 200 bad sticks.

    Moreover, we've had about 400-500 bad HDDs.

    Rough numbers, but it gives you an idea. CPUs only fail if something else fails (e.g. CPU fan, power supply overload).
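To put those rough counts side by side, here is a small sketch that turns them into failure fractions. The RAM total is given only as a range (40-50K), so the midpoint used below is an assumption; the HDD count is left out because the post gives no total number of drives to divide by.

```python
# Counts quoted in the post, covering roughly 10 years of use.
cpu_failed, cpu_total = 4, 30_000
# Assumption: midpoint of the quoted "40-50K" range for the RAM population.
ram_failed, ram_total = 200, 45_000

def failure_fraction(failed, total):
    """Fraction of units that failed over the whole period."""
    return failed / total

cpu_frac = failure_fraction(cpu_failed, cpu_total)
ram_frac = failure_fraction(ram_failed, ram_total)
ram_vs_cpu = ram_frac / cpu_frac

print(f"CPU: {cpu_frac:.4%} failed, RAM: {ram_frac:.4%} failed")
print(f"RAM sticks failed about {ram_vs_cpu:.0f}x as often as CPUs")
```

With the midpoint assumption, RAM comes out at a few dozen times the CPU failure fraction; the exact multiple shifts with whichever end of the 40-50K range is closer to the truth.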

  21. #21
    Join Date
    Mar 2010
    Location
    Upstate New York
    Posts
    1,446
    Have never experienced CPU failure but am wondering if MTBF would be information provided by the manufacturer. Have you tried asking them?
    I can't say it's information they'd be willing to share; more likely independent web sites will provide averages.
    John Rasri
    Private Label Live Chat Provider For Resellers
    GotLiveChat.com
    White Label/Brand-able live chat software solutions

  22. #22
    Join Date
    Oct 2009
    Posts
    856
    The only fried CPU I've experienced over the years was one that literally caught fire. Well, not the CPU itself, but the power connector that decided to spontaneously melt was right below it.

  23. #23
    Join Date
    Dec 2005
    Posts
    3,077
    I've seen a lot of failed Pentium 4 CPUs lately that are a good 5 or 6 years old, so I would say around 5 years is a decent estimate.

    It depends on the workload, of course, and on how cool they are kept.

  24. #24
    Join Date
    Jan 2003
    Location
    Chicago, IL
    Posts
    6,889
    We've seen more "fail" with various memory bugs/issues, etc. than a hard failure. I don't recall seeing ANY that have just had a complete hard failure out of the thousands we have.

  25. #25
    Join Date
    Mar 2010
    Location
    Germany
    Posts
    681
    I almost agree with the one in thousand estimate.
    You'll notice when you get errors in /var/log/mcelog, for example (mainboard, memory, or CPU errors).
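A minimal sketch of that check, assuming the log lives at the path mentioned above and that a non-empty file means something has been logged. The helper name `has_logged_events` is hypothetical, and it does not try to parse the log or tell mainboard, memory, and CPU errors apart.

```python
import os

def has_logged_events(path="/var/log/mcelog"):
    """Return True if the log file exists and is non-empty, False otherwise."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # Missing file or no permission: treat as "nothing to report".
        return False

print("events logged" if has_logged_events() else "no events logged (or no log file)")
```

Something like this could run from cron and send an alert, but reading the log by hand is still needed to see which component is being blamed.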

    At $oldjob we had about one CPU replacement per year on the "real Unix boxes" (including cases where the error wasn't really traceable and the "mainboard" + CPUs were swapped). When they switched to ProLiant blades it got a little more common, like 5-6 per year. Compared to I-don't-know-how-many memory failures, it was like a rounding error.
    Check out my SSD guides for Samsung, HGST (Hitachi Global Storage) and Intel!

  26. #26
    Join Date
    Jul 2007
    Location
    Ashburn, VA
    Posts
    1,314
    I've only seen CPUs fail when they've been significantly overclocked. Other than that, I've never had a CPU fail.
    Preetam Jinka

    Isomerous - High performance web services for business and individuals.
    Bitcable Colocation, KVMs, cPanel hosting, Oracle expertise, and more.

  27. #27
    Join Date
    Apr 2009
    Location
    USA / UK
    Posts
    4,553
    I've never seen a CPU die out of the blue without any direct cause (i.e. from "old age").

    Usually CPUs die when they overheat (although that is somewhat rare, since most systems have overheat protection and will shut themselves off before permanent damage occurs), get hit by a power surge (use a surge suppressor/UPS/genset to prevent that), or, in rarer cases, when the motherboard goes poof and takes the CPU with it (don't use cheap motherboards!).
    RAM Host -- Premium & Budget Linux Hosting From The USA & EU
    █ Featuring Powerful cPanel CloudLinux Shared Hosting
    █ & Cheap Premium Virtual Dedicated Servers
    Follow us on Twitter

  28. #28
    Join Date
    Jun 2005
    Posts
    2,574
    Quote Originally Posted by funkywizard View Post
    RAM is usually pretty good as well, though I wouldn't be as shocked to see it fail as I would a CPU. I have a few hundred gigs of RAM running without any failures (knock on wood).
    Knock on wood and don't read this:

    DRAM Errors in the Wild: A Large-Scale Field Study

    excerpt

    This paper provides the first large-scale study of DRAM
    memory errors in the field. It is based on data collected
    from Google’s server fleet over a period of more than two
    years making up many millions of DIMM days. The DRAM
    in our study covers multiple vendors, DRAM densities and
    technologies (DDR1, DDR2, and FBDIMM).

    The paper addresses the following questions: How common
    are memory errors in practice? What are their statistical
    properties? How are they affected by external factors,
    such as temperature, and system utilization? And how do
    they vary with chip-specific factors, such as chip density,
    memory technology and DIMM age?
    You will only find out how good a provider is when the going gets tough

  29. #29
    Join Date
    Jan 2011
    Posts
    127
    What about tray processors? Might tray processors have a higher failure rate than boxed ones? Is there any reason why there's only a 1 year warranty behind tray processors and a 3 year warranty on retail boxed?

  30. #30
    Join Date
    Apr 2009
    Location
    USA / UK
    Posts
    4,553
    Quote Originally Posted by celona View Post
    What about tray processors? Might tray processors have a higher failure rate than boxed ones? Is there any reason why there's only a 1 year warranty behind tray processors and a 3 year warranty on retail boxed?
    I suspect most on here are using OEM processors.

    IMO no difference between retail box vs oem.
