Memory Trouble in Stormville

The OS/2 Museum recently acquired a genuine Intel DX79SR (Stormville) board. Together with its close siblings DX79SI (Siler) and DX79TO (Thorsby), these were the last “great” Intel motherboards, supporting the big LGA 2011 socket for the Sandy Bridge E platform—but not Ivy Bridge, because Intel treated buyers of its final boards rather poorly and refused to update the board firmware to support Ivy Bridge E CPUs.

The DX79SR is extremely similar to the older DX79SI which it replaced in Intel’s lineup. The only noteworthy differences are that the Stormville adds two additional rear USB 3.0 ports and two internal 6Gbps SATA ports (through an onboard Marvell SATA controller).

A detail shot (voltage regulator heatsink) of an Intel DX79SR desktop board, 2012.
Intel DX79SR board detail

At $299 (price at May 2012 introduction), the DX79SR was a rather pricey board for rather pricey CPUs. Why would anyone want one? Because it was the only way to get a desktop board (from Intel) supporting an Intel CPU with more than four cores and with support for more than 32GB RAM. All “standard” desktop boards for Sandy Bridge and Ivy Bridge platforms (and even for Haswell in fact) were limited to four cores and 32GB RAM.

It is also noteworthy that the board supports not only Sandy Bridge E but also Sandy Bridge EP processors, and can thus run with not just the six-core i7-branded CPUs but also six-core or even eight-core LGA 2011 Xeons, such as the beefy eight-core E5-2687W.

In my testing, the DX79SR coupled with an i7-3930K is an impressive performer, albeit a real power guzzler. The six-core CPU is rated at 3.2 GHz base frequency and 3.8 GHz turbo, but it easily overclocks to 4.6 GHz turbo with air cooling. In multi-threaded workloads, the old Sandy Bridge E can still easily keep up with today’s quad-core CPUs.

That’s all well and good. Unfortunately, getting more than 32GB (or at first even 32GB) going in the Stormville board turned out to be quite difficult.

When I first set up the board, I used just two memory modules. That worked without any trouble. Three modules worked fine as well, and improved memory performance: the i7-3930K has a quad-channel memory controller and benefits from having three rather than two memory channels filled. And it benefits even more from populating all four channels. Or maybe it would.
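To put rough numbers on why each additional channel matters, here is a back-of-the-envelope sketch of theoretical peak bandwidth. It assumes DDR3-1600 modules (chosen here as an example speed) and the standard 64-bit DDR3 channel width; real-world throughput is always lower, and the function name is made up for illustration.

```python
# Theoretical peak DDR3 bandwidth: transfers per second times 8 bytes
# per transfer (one 64-bit channel), times the number of channels.
BYTES_PER_TRANSFER = 8  # a DDR3 channel is 64 bits wide


def peak_bandwidth_gbs(mega_transfers: int, channels: int) -> float:
    """Return the theoretical peak bandwidth in GB/s (10^9 bytes/s)."""
    return mega_transfers * 1_000_000 * BYTES_PER_TRANSFER * channels / 1e9


# DDR3-1600 is an assumed example speed, not a measurement.
for channels in (2, 3, 4):
    gbs = peak_bandwidth_gbs(1600, channels)
    print(f"{channels} channels of DDR3-1600: {gbs:.1f} GB/s peak")
```

Going from three to four channels adds another third on paper, which is why leaving the fourth channel empty is annoying even when the capacity is not needed.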

With four modules, things just wouldn’t get off the ground. The board would power off and restart again, and the only hint was POST code b0 hex (probably that) shown on the board’s LED display.

So I started experimenting. The board has two sets of DIMM slots, blue and black. The manual only says that blue slots need to be populated first, starting from slot 1. Experimentation showed that the manual does not tell the whole truth; the actual requirement is that a black slot cannot be populated unless the corresponding blue slot is also populated.

It is, for example, possible to populate slots 1 and 5 (blue and black), although for performance reasons it’s much better to populate slots 1 and 2.

What the manual also does not explain is that blue slots need not be strictly populated starting from the first one. It is perfectly possible to only populate slot 1, or only slot 2, or only slot 3.
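The population rule as I observed it can be written down as a small check. This is only a sketch of my experimental findings, not anything from Intel's documentation; it assumes that black slot n+4 pairs with blue slot n (slots 1 and 5 share a channel, as do slots 4 and 8), and the function name is invented for illustration.

```python
# Sketch of the observed rule: a black slot may only be populated
# if the corresponding blue slot is populated too.
BLUE_SLOTS = {1, 2, 3, 4}
BLACK_SLOTS = {5, 6, 7, 8}


def population_errors(populated):
    """Return a list of rule violations for a set of populated slot numbers."""
    errors = []
    for slot in sorted(populated):
        if slot not in BLUE_SLOTS | BLACK_SLOTS:
            errors.append(f"slot {slot} does not exist")
        elif slot in BLACK_SLOTS and slot - 4 not in populated:
            # Assumed pairing: black slot n+4 sits behind blue slot n.
            errors.append(f"black slot {slot} needs blue slot {slot - 4}")
    return errors


print(population_errors({1, 5}))  # blue + black on one channel: allowed
print(population_errors({3}))     # a single blue slot, not the first: allowed
print(population_errors({5}))     # black without its blue partner: rejected
```

Note that the check says nothing about performance; slots 1 and 5 pass, but both modules then share a single channel.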

DIMM slot 4 turned out to be the troublemaker. No matter what I did, as soon as anything was in slot 4, the board failed to POST. I tried a number of different modules and their combinations, and quickly established that a module that works perfectly fine in slot 1, 2, or 3 still fails in slot 4.

A detail of Intel DX79SR board box.
In a word… nope

I concluded that the fourth slot must be bad, even though visual inspection of the board did not reveal any hint of damage and the board appears to have been well cared for.

But of course I had to do a bit of research first and determine if this was a known problem. And sure enough, it was, or at least I was not the first to run into it. Back in 2013, someone concluded that DIMM slots 4 and 8 (the fourth memory channel) on his board were defective. Got a replacement board… same problem. Got another replacement board… still the same problem.

It appears that for some mysterious reason, the DX79SR board is extremely picky about the memory in the fourth memory channel. It is clearly not a problem with the memory being defective, since the same modules work perfectly in other slots on the same board!

I tried a lot of different memory modules. Corsair, Kingston, Crucial, Samsung, G.Skill… and they all refused to work in slot 4. Soon enough I realized that it’s actually not that hard to put 48GB RAM in the board: The key is populating slots 1-3 and 5-7, while leaving slots 4 and 8 alone. That at least is a clear improvement over regular Sandy Bridge (and Ivy Bridge, and Haswell) boards.
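The workaround is easy to tabulate. The sketch below assumes 8GB modules and a slot-to-channel mapping in which slots n and n+4 share a channel (only the 4/8 pairing on channel D is certain from the above; the rest is extrapolated), and the helper names are made up.

```python
# Which channels are in use, and how much RAM, for a given slot population.
CHANNELS = "ABCD"


def channel_of(slot: int) -> str:
    """Map slot 1..8 to a channel letter; slots n and n+4 are assumed to share one."""
    return CHANNELS[(slot - 1) % 4]


def summarize(populated, module_gb=8):
    """Return (sorted list of active channels, total capacity in GB)."""
    channels = sorted({channel_of(slot) for slot in populated})
    return channels, len(populated) * module_gb


# The workaround: skip slots 4 and 8, i.e. leave channel D empty.
print(summarize({1, 2, 3, 5, 6, 7}))
# The goal: all eight slots populated.
print(summarize(set(range(1, 9))))
```

With six 8GB modules that comes to 48GB on channels A-C; the full 64GB needs channel D to cooperate.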

A selection of memory modules that work perfectly in memory channels A-C of a DX79SR board but fail in channel D.
No go in channel 4

At the same time, there are reports of people who did successfully run these boards fully populated with 64GB RAM. What were they doing differently? Hard to say.

For good measure, I tried a different CPU in case that might make any difference, but it didn’t (the reports are of i7-3930K both working and not working with fully populated memory in a DX79SR board).

I even tried downgrading the BIOS (and nearly bricked the board when upgrading the BIOS again) but that made no real difference. The board was failing slightly differently with older BIOS (it took noticeably longer for the board to shut off when the 4th memory channel was populated) but nothing really changed.

For the sake of completeness, I should note that although it is ostensibly the fourth memory channel that’s causing trouble (Intel names them A/B/C/D, and D refuses to work), it is actually the second integrated memory controller (IMC). The IMCs are numbered 0-3 and from performance statistics it’s clear that IMC1 is unpopulated. Moreover, when reading DIMM SPD data it’s slots 2/3 that aren’t populated, which again corresponds to IMC1. I have no explanation for this numbering discrepancy; I do not think it has any real bearing on the trouble with the fourth memory channel.

USB Fun

The board also had a rather interesting problem with one of its USB controllers. The USB 3.0 controller driving the rear ports (a Renesas μPD720201) only worked sometimes—that is to say, quite often the PCI device simply was not there at all. But sometimes it was!

The problem appears to have gone away after the PSU was replaced (an older 450W PSU was replaced by a brand new 750W PSU). The vanishing USB controller was the only problem that could likely be traced to the PSU, and the 450W PSU happily works with many other boards.

The USB controllers on the DX79SR board have plenty of their own known issues, which Intel attempted to solve through USB firmware updates (firmware for the two USB 3.0 controllers, entirely separate from the BIOS). A vanishing USB controller does not appear to be a known problem though.
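For anyone chasing a similar disappearing act under Linux, a quick way to see whether the controller showed up on a given boot is to scan sysfs for its PCI IDs. This is a minimal sketch, not something I relied on for the testing above: the IDs 1912:0014 are what I recall for the Renesas μPD720201 and should be double-checked against `lspci -nn` on a boot where the device is present, and the function name is invented.

```python
from pathlib import Path

# Assumed PCI IDs for the Renesas uPD720201; verify with `lspci -nn`.
RENESAS_VENDOR = "0x1912"
UPD720201_DEVICE = "0x0014"


def find_pci_devices(vendor, device, root="/sys/bus/pci/devices"):
    """Return sysfs names of PCI devices matching the vendor/device ID pair."""
    matches = []
    root_path = Path(root)
    if not root_path.is_dir():
        return matches  # not Linux, or sysfs not mounted
    for dev in sorted(root_path.iterdir()):
        try:
            dev_vendor = (dev / "vendor").read_text().strip().lower()
            dev_device = (dev / "device").read_text().strip().lower()
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        if dev_vendor == vendor and dev_device == device:
            matches.append(dev.name)
    return matches


if __name__ == "__main__":
    found = find_pci_devices(RENESAS_VENDOR, UPD720201_DEVICE)
    print("controller present at:", found if found else "nowhere (vanished)")
```

Run once per boot and logged, something like this would at least show how often the controller goes missing with a given PSU.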

Back to the Memory

I would be thankful for any tips on how to get the 4th memory channel going in the DX79SR board. There is a chance the slot/channel is simply bad, but I consider that quite unlikely, in part because the previous owner of the board confirmed that the actual board did run with fully populated memory.

My suspicion is that the BIOS is somehow responsible. The matrix of BIOS settings on this board is quite unusually complex and the behavior of the BIOS does not inspire any confidence; changing one setting is prone to changing several other settings, and at the same time certain settings persist much longer than one would expect (e.g. overclocking settings for i7-3930K remained after an i7-3820 was installed, yet with an i7-3820 it was not actually possible to arrive at those settings). The short of it is that the BIOS settings are so complex that I could easily be missing something or doing something “wrong” without realizing it. I just don’t know what, and I’m extremely (if perhaps unreasonably) bothered by not being able to use the board to its full potential.


14 Responses to Memory Trouble in Stormville

  1. Yuhong Bao says:

    I wonder if putting the board in a different case might help.

  2. Michal Necasek says:

    Unfortunately useless because it does not explain why the same module works in three channels but not the fourth. There is nothing wrong with the memory modules, and I tried at least 10 different types of varying speeds and sizes. All fail in the fourth channel in exactly the same way. I even tried Crucial memory that Crucial explicitly lists as compatible with the DX79SR and it behaved just like all the others… works in channels A-C, not in channel D. But why?

  3. Michal Necasek says:

    It’s not in a case… so probably not. But why do you think it might?

  4. Zir Blazer says:

    About the USB Controller thing, this can be tangentially related and you may want to keep it bookmarked, just in case that you want to use Linux (Note that the developer says that it is for the upd720202 but it is extremely closely related to the upd720201, albeit code changes may be required):
    https://mjott.de/blog/881-renesas-usb-3-0-controllers-vs-linux/
    https://github.com/markusj/upd72020x-load
    https://github.com/Ntemis/renesas-fw
    https://old.reddit.com/r/VFIO/comments/f5tabh/renesas_usb_controller_issue_no_superspeed/

    For the memory issues, have you tried forcing settings instead of letting SPD autodetection do its thing, THEN shutting down and inserting the fourth module? I would try setting the lowest possible speed (should be 400 MHz for DDR3), manual timings and voltage but at SPD-provided settings, just to see if it manages to POST.

  5. bhtooefr says:

    Are you using DDR3L or DDR3 RAM? There’s some conflicting info in the RAM specs, whether it’s 1.35 or 1.5 volts… and DDR3L will tolerate 1.5 volts, so might be worth cranking the voltage up to 1.5 if it’s not already there. 1.65 shouldn’t be necessary, but…

    Take the speed down to 1600 if it isn’t already – anything faster is XMP land, and XMP is overclocking.

    Worth looking at the timings to see if you can get stability there, too. If command rate is 1T, raise it to 2T. Set the other timings to 11-11-11-28 if you’re not there already, or 11-11-11-30 otherwise (slower than that should be completely unnecessary, 11-11-11-30 is about as slow as DDR3-1600 would be expected to go).

    …of course there’s the possibility that you’d set up these timings and voltages, and then put the extra RAM in and it re-reads the SPD data and overwrites your settings…

  6. Michal Necasek says:

    Yes, the uPD720202 and uPD720201 are effectively identical, the difference is only the number of ports (four on the ‘201 and two on ‘202). The controller actually works fine under Linux, it has an EEPROM that provides the firmware.

    Yes, I tried automatic and manual memory settings. I am not entirely sure that it makes any difference because the BIOS clearly detects that the module is present and may switch to defaults. The POST code is unfortunately not documented and I have no idea if the BIOS can’t read the SPD, refuses to configure the memory, or if the memory fails to work.

    I tried DDR3-800, various timings, various slot populations, none of that made any difference at all. Although it is entirely possible that I’m missing something specific.

    FWIW, most of my modules are DDR3-1600 and default to that speed. The G.Skill modules in the picture default to DDR3-1333 and need to be manually configured to run faster.

  7. Michal Necasek says:

    Most of my modules are plain DDR3, I also tried DDR3L with no discernible difference. The board documentation claims it supports only 1.35V modules but I see no issue with regular 1.5V modules, and the BIOS is happy to use 1.65V if that’s what XMP indicates.

    I have done minimal experiments with memory faster than DDR3-1600, since 8GB modules faster than DDR3-1600 are kind of expensive even now. I tried slowing things down but I could try more I guess (the BIOS has way too many settings for my liking really).

    The last bit is exactly my fear, that the BIOS detects the new module and throws out the existing config. Hmm, I might be able to figure out whether it does that or not when using the other slots.

  8. I wonder if putting the board in a different case might help.

    Probably that won’t be the case 😀

  9. zeurkous says:

    Lame pun.

  10. Chris M. says:

    It might have been tried already, but…. have you tried clearing the CMOS?

    -Pull the battery or use the clear CMOS jumper.
    -On powerup, go into the BIOS setup and Load Defaults.

    My cousin has an old X79 based Lenovo ThinkStation that had all kinds of weird problems until I did the above. The most notable was no video between booting the OS and when Windows initialized the video card driver. Some parameter in the CMOS was knocking out the video card’s output when it was being driven by the on-board ROM, so generic VGA and VESA modes outputted nothing.

  11. Michal Necasek says:

    Yeah tried resetting the CMOS completely.

    The machine also has a configuration jumper and a Back to BIOS button which both boot the machine with defaults. None of that makes any difference.

  12. Jake says:

    Try re-socketing the cpu and using a different cooler.

    The 2011 socket had some issues with making contact with all pins and requires pretty brutal contact pressure to make it good. might also be a damaged pin in the socket.

    Lastly it could be a dead IMC, do you have another cpu (or board) to cross check with?

    I have 8 total (2 kits) of KHX1600C9D3/4G in my DX79SR with a 3960x and it runs fine on XMP and posts without issues.

  13. Michal Necasek says:

    See my follow-up post Return to Stormville. With more boards and CPUs I was able to establish that the CPU is fine and the memory is definitely fine (I knew that already, really). There was a bent pin in the socket, but because it was on the edge of the socket, it was surprisingly difficult to spot.
