X7DBE WTF

Several years ago I got two Supermicro X7DBE boards at a bargain price. These are nice dual Socket 771 boards of circa 2007 vintage, built around the Intel 5000P Blackford chipset and using FB-DIMMs with up to 32GB memory supported.

Recently I pulled one of the boards out of storage and installed two quad-core Harpertown Xeon E5450 processors in it, primarily for the purpose of verifying that newly arrived FB-DIMMs work. The memory worked just fine… but something else didn’t.

The CPUs ran at only 1.2 GHz (instead of 3.0 GHz), and even worse, one of the CPUs was not recognized. I swapped the CPUs around but that didn’t change anything—one was still not recognized. I plugged in two slightly slower Xeon E5430 CPUs… and one still wasn’t recognized, and instead of 2.66 GHz the CPUs ran at 1.06 GHz. I thought the board perhaps got damaged during moving and didn’t investigate further.

Sometime later I pulled out the other X7DBE board. It had two E5430 Xeons installed already. When I first tried it, both CPUs were recognized, running at 2.66 GHz as they should. Board clearly working.

So I thought, let’s put the faster E5450s in it. I did… and one of them wasn’t recognized, while the other ran at 1.2 GHz! I put the original E5430s back, but oops, one wasn’t recognized and the other ran at 1.06 GHz.

Okay, this is really weird. In desperation, I put in two Dempsey 5080 Xeons, the latest and greatest (if such a thing can be said) NetBurst Xeons. And again, one of them was not recognized, although the other at least ran at the full 3.73 GHz. Hmm…

For good measure I also tried two dual-core Woodcrest 5110 Xeons, but it was just more of the same. Only one CPU recognized, running at 800 MHz instead of 1.6 GHz. I tried all I could think of—resetting the CMOS (more than once), tweaking various BIOS settings, but nothing helped. Besides memory, the board was stripped to the bare minimum—onboard video and keyboard, nothing else. Nothing made any difference.

As always, this sort of problem has been seen before, with no resolution.

Further Probing

A few days later I ran additional tests. First of all I tried a different PSU, but that made absolutely no difference. I should note that the PSU I mostly used with the X7DBE boards has no trouble powering a dual Opteron board.

Next I did a bit of math to figure out the strange frequencies. The result was pretty clear: The CPUs run using the proper multiplier but with a 133 MHz BCLK (which is technically not even supported by these CPUs at all!). Thus a 3.0 GHz CPU which normally runs with a 333 MHz BCLK and 9.0 multiplier becomes a 1.2 GHz CPU with 133 MHz BCLK and the same multiplier. A 2.66 GHz CPU is the same thing only with an 8.0 multiplier, and becomes a 1.06 GHz processor with 133 MHz BCLK. In the last example, a 1.6 GHz Xeon 5110 normally uses a 266 MHz BCLK and a 6.0 multiplier, therefore dropping to exactly half the speed with 133 MHz BCLK.
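The arithmetic above can be checked with a minimal sketch. It assumes only the BCLK and multiplier values quoted in the paragraph, written as exact thirds (333 MHz as 1000/3, and so on). The names `NOMINAL`, `FALLBACK_BCLK`, `core_clock_mhz` and `observed_vs_nominal` are made up for illustration.

```python
from fractions import Fraction

# Nominal (BCLK in MHz, multiplier) pairs as quoted in the text.
# BCLK values are exact thirds: 333.33 MHz = 1000/3, 266.67 MHz = 800/3.
NOMINAL = {
    "Xeon E5450": (Fraction(1000, 3), 9),
    "Xeon E5430": (Fraction(1000, 3), 8),
    "Xeon 5110": (Fraction(800, 3), 6),
}

# The 133 MHz bus clock the boards apparently fell back to (400/3 MHz).
FALLBACK_BCLK = Fraction(400, 3)


def core_clock_mhz(bclk_mhz, multiplier):
    """Core clock is the bus clock times the multiplier."""
    return bclk_mhz * multiplier


def observed_vs_nominal(name):
    """Return (nominal MHz, MHz at the 133 MHz fallback BCLK) for a CPU."""
    bclk, mult = NOMINAL[name]
    nominal = core_clock_mhz(bclk, mult)
    observed = core_clock_mhz(FALLBACK_BCLK, mult)
    return float(nominal), float(observed)


for cpu in NOMINAL:
    nominal, observed = observed_vs_nominal(cpu)
    print(f"{cpu}: nominal {nominal:.0f} MHz, at 133 MHz BCLK {observed:.0f} MHz")
```

With the multiplier held constant, the fallback clock reproduces the observed 1.2 GHz, roughly 1.06 GHz, and 800 MHz figures.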

Okay, so maybe the board has trouble reading the CPU signals identifying the desired BCLK frequency. I looked up which pins are used for that and carefully checked all four sockets (two on each board). But I found no problems, all pins were straight with no sign of damage or dirt. Checked the CPUs as well, found no problems either.

I also tried removing memory (those FB-DIMMs are a bit power hungry) but again, there was no difference.

Then I started wondering what would happen with just a single CPU. And that finally got me somewhere.

I established that on both boards, the first socket (labeled CPU1) simply won’t work. That is to say, if the board is powered up with just CPU1 populated, the CPU gets warm(!) but the power indicator LED on the board doesn’t light up and there is no sign of life from the board except for the fans spinning.

There were about two exceptions, when the LED did light up… and nothing further happened.

On the other hand, if only the second socket (CPU2) is populated, the board’s power LED lights up instantly, the system POSTs and boots, and the CPU runs at the correct frequency.

The same thing happens with both of the X7DBE boards and all the CPUs I tried: Xeon 5050, Xeon 5080, Xeon 5110, Xeon E5430, Xeon E5450. I did not find any broken CPU; they all worked in the CPU2 socket. Only the first socket on both boards refuses to cooperate.

Note that the board documentation says nothing about running with a single CPU. But there is evidence that it’s somewhat commonly done, and good evidence that on this particular board model, a lone CPU in the first socket should work.

Answer, or a Question?

The conclusion, such as it is, appears to be that both boards have some damage on the first socket. How both boards got damaged in a short time window in exactly the same way is the real mystery, since there is no visible damage on either of the boards.

How that happened with no apparent damage to any of the CPUs is a further mystery. Given that there is no visible mechanical or electrical damage, perhaps the problem is micro-cracks in the boards or something of that nature. Whatever it is does not prevent the CPU in the first socket from drawing power (since it gets warm) but does prevent it from being detected.

Whatever the cause of the problem is, it’s very vexing and annoyingly mysterious.

18 Responses to X7DBE WTF

  1. Chris M. says:

    Could be that wonderful lead free RoHS compliant solder in action. Did these boards work fine before they were put away for storage?

  2. Michal Necasek says:

    Yes, they did, but at least one of them also worked fine right before it didn’t.

  3. Mark Hughes says:

    If you don’t have a pressing need for the boards you could make an attempt at reflowing them in your oven. Might work, might kill them…

  4. Michal Necasek says:

    I might try that, although I would sort of like to know beforehand if it has any chance of helping. The fact that two boards developed the exact same problem really makes me wonder.

  5. rasz_pl says:

    >other X7DBE board. It had two E5430 Xeons installed already. When I first tried it, both CPUs were recognized

    It sure sounds like putting E5450 did something to first socket on both boards 😮
    I would start by
    – measuring Vcore/Vtt on both sockets with board running in bad state with both CPUs mounted. CPU can warm up even with one of those missing. You will have to trace source of those voltages.
    – tracing BSEL pins on both sockets to the clock chips.

    133MHz vs 266MHz suggests something is pulling BSEL0 pin high on first socket.

  6. Michal Necasek says:

    It does sound that way, but that would be the first time I’ve seen that a CPU that appears to be perfectly fine blew up a board. I’m also not sure why the CPU would damage one socket but not the other (the first thing I tried was swapping the CPUs around).

  7. rasz_pl says:

    Might be design weakness of the first socket. 775 CPU socket testers are ~$10, ebay “771 775 Tester Desktop Motherboard CPU Socket Analyzer Card Dummy Load with LED”.

    PS: RSS feed didn’t update for this post.

  8. Dale Smoker says:

    Per RSS, I likewise wasn’t notified of this post until the 8/1 post was made. This has been happening for some time; being alerted for /two/ new posts from OS/2 Museum.

    I’m using Feedbro for Firefox.

  9. zeurkous says:

    Same RSS problem here. Using a little monitoring script that pulls /feed
    every hour and compares it against an existing copy (alerting me and
    replacing said copy when it changes).

  10. zeurkous says:

    /feed -> /wp/feed

  11. Michal Necasek says:

    There is some funny caching problem on the server side, most of the time RSS picks up the changes but sometimes it doesn’t. So far I just clear the cache every now and then, if it becomes a real problem I’ll have to dig deeper.

  12. Michal Necasek says:

    I got myself one of those, unfortunately with instructions only in Chinese. All LEDs light up in all the sockets, though I’m not really sure what that tells me. I could not find any detailed explanation of what these testers actually test. Does anyone know?

    I also see the tester has pads labeled VTTPG, PWRGOOD, CLK0/1, BCLK0/1, RESET, and VTT. What can one do with those?

  13. rasz_pl says:

    BCLK0/1 is probably BSEL0/1, but where is BSEL2 on the tester? 😮
    Download the 771 pinout, or look up the pictures attached to the reddit thread “Help me OC LGA 771 Xeon by BSEL mod PLEASE!” and determine if those pins (G29, H30, G30) are connected on both sockets to something, or if they are maybe linked with each other in some way. This might be difficult, with tracks from the socket going right into the middle layers of the PCB and ending up under some BGA chip 🙁

    From your description first socket lost connection to one of those pins somehow.

  14. rasz_pl says:

    To expand on my previous clues: the difference between 133 and 266 FSB is BSEL0 pulled low vs pulled high. By default, with no CPU, this pin is pulled high to VTT by a ~470 ohm resistor. You booting 266 MHz FSB processors in the first socket downclocked to 133 MHz suggests the BSEL0 connection was either severed due to a bent pin, cracked track, or cracked solder joint, or straight up shorted to VTT (IO supply voltage) by either a bent LGA pin or a blown component on the board. Sadly I have no access to dual-CPU 771 schematics to look up typical arrangements (do they go by the first active socket? lowest common denominator? buffer both together? straight join them and don’t care since you shouldn’t put in different CPUs anyway? no idea), only low-res pictures on the web showing an ICS-branded clock generator and maybe a clock buffering chip; I can’t read the designations.

    Btw, if you mail me I’ll send you a link to a treasure trove of motherboard diagrams, going all the way back to 1997 Socket 7 430TX/HX/VX days. I’m guessing posting links in comments will get it marked as spam.

  15. Michal Necasek says:

    Try posting links, if it’s not completely eaten by a spam filter then I can approve the post.

  16. rasz_pl says:

    Over 6GB of boardviews and diagrams. For you most interesting stuff probably sits here: https://schematic-x.blogspot.com/2018/04/more-schematics.html

    For example, a weird quirk I found regarding the 430TX/HX/VX chipsets: the northbridges (82439TX/82439HX) require a separate jumper pulling Address Bus line 27 high _only_ when running at 60 MHz FSB :o. 50/55/66 MHz are all fine with this pin dangling in the air (or pulled low according to datasheets). A27 is used as a boot setup strap configuring the initial DRAM Refresh Rate register value, but the chipset is seemingly only concerned with the 60 MHz clock scenario, treating all other options, both higher and lower, equally?!? Normally one would expect 66 MHz to be the odd case. Pretty weird.

  17. Michal Necasek says:

    Cool, thanks! FWIW, the comment came through without me doing anything. I think one link from a known poster is likely to pass the filters, when there’s a bunch of links it gets iffy, or when it’s someone who never posted an approved comment before.

    That is sure a nice collection of schematics, I like the Intel ones especially, this will be useful. Thanks!

  18. rasz_pl says:

    Asus, MSI, ECS boards go down to K7 (KT133)/P4 (845)
    MSI being masters of crap and oem even have some P4 SIS SDRAM and 850 RDRAM (MS-6504) models in there.
    Also other end of the spectrum MSI K7D Master (MS-6501), its oem Tyan Thunder K7 (MS-6502), both AMD-760 dual socket A. MSI 860D Pro (MS-6508) dual Xeon socket 603.
    Gigabyte has a couple socket370 CopperMine/Tualatin i915/VIA.
    IBM has ThinkPad 600X (in openboardview supported file format).

    I’m very interested in any diagrams/boardviews/datasheets for pre-1998 hardware.
