Solaris 2.5.1 and 2.6 crashes on modern Intel CPUs

I recently found that Solaris 2.6 and 2.5.1 do not work when run in a VM on a modern Intel CPU (Sandy Bridge generation Core i7), or to be exact, fail most of the time (about nine times out of ten) when nested paging is used. The symptom is Solaris hanging or rebooting immediately after the kernel is loaded, even before the kernel banner is printed. When nested paging (or hardware virtualization) isn’t used, there’s no problem.

After managing to boot Solaris 2.6 on a physical Core i7 system (not so easy, since a boot floppy plus CD is required!), it turned out that the exact same thing happens, and this is thus not a virtualization issue. But what’s going on there? And why would nested paging fail when the older, slower virtualization methods work? Thanks to the debugging capabilities built into Solaris, it’s possible to answer most of those questions.

One of the nice things about Solaris is that it has had a usable kernel debugger for a long time, and unlike most operating systems, the kernel debugger is always there and in fact available even on the installation CD. That makes it possible to debug a problem like this without even installing Solaris.

All one needs to do is enter b kadb -d at the boot prompt. That will cause kadb (the kernel debugger, kernel adb) to be loaded; the -d switch causes kadb to stop at the earliest opportunity, so that the user has a chance to set breakpoints etc.

It should be noted that kadb is derived from adb, and thus uses syntax which can only be described as “interesting”.

Cause of the crash

The crash occurs very early during system initialization. The first thing the Solaris kernel does is detect the CPU type, and if support for 4MB pages is available (Pentium/Pentium Pro class CPUs), that support is enabled. This is where things go wrong.

The Solaris kernel hits a page fault, which is intercepted and handled by the kernel. Unfortunately, the fault handler triggers another exception, perhaps because the kernel isn’t really set up yet, causing a cascade of faults. The stack eventually overflows and ends up overwriting the GDT, which is stored in memory just below the stack. That is fatal, because the trap handling code reloads segment registers a lot, and with a corrupted GDT, an exception cannot be dispatched. OS death ensues.

The ultimate cause is the first trap in the cycle. To enable 4MB pages, Solaris needs to modify the CR4 register, which must be done with paging disabled. To that end, Solaris creates an identity mapping (physical address equals virtual address) for a single page which holds the routine modifying CR4. The routine turns off paging in CR0, updates CR4, turns paging on again, and returns to the caller. The (indirect) call to the routine causes the crash.

A bit of work with kadb and knowledge of the x86 architecture makes it possible to determine that the indirect call instruction causes a page fault because the destination page is not present. Yet using the debugger to examine the supposedly non-present page shows that it’s very much there. It’s also safe to assume that when Solaris 2.6 was released, it did not crash like that. What’s going on there?

Changed semantics and living dangerously

The true cause of the crash is not easy to determine. What is known is that Solaris 2.6 happily works on Pentium II class machines, as well as AMD Phenom CPUs. It may be only the Intel Core i5/i7 CPUs it has trouble with.

It’s fairly clear that one strong contributing factor to the crash is that Solaris updates the page tables but does not invalidate the TLB (Translation Lookaside Buffer) for the updated page. That’s living dangerously, but by itself shouldn’t cause problems. If an existing page mapping were changed without invalidating the TLB, that would be asking for trouble. However, in this case a previously non-present page is made present, and in that case the CPU can’t store anything in the TLB. Therefore when the page is referenced, the CPU should traverse the page tables in memory and find the expected mapping.
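The difference between the two cases can be illustrated with a toy model. This is only a sketch, not a description of how any real CPU is built: ToyMMU, its methods, and the page numbers are hypothetical names made up for illustration, and the model assumes a TLB that caches translations for present pages only.

```python
class PageFault(Exception):
    """Raised when a virtual page has no present mapping."""


class ToyMMU:
    """Toy model: a page table plus a TLB that caches only present mappings."""

    def __init__(self):
        self.page_table = {}  # virtual page -> physical page (present entries)
        self.tlb = {}         # cached translations

    def map_page(self, vpage, ppage):
        # Update the page table WITHOUT touching the TLB.
        self.page_table[vpage] = ppage

    def invalidate(self, vpage):
        # Rough analogue of invalidating a single TLB entry.
        self.tlb.pop(vpage, None)

    def translate(self, vpage):
        if vpage in self.tlb:
            return self.tlb[vpage]          # TLB hit, page table not consulted
        if vpage not in self.page_table:
            raise PageFault(hex(vpage))     # not present, nothing is cached
        self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]


mmu = ToyMMU()

# Case 1: a previously non-present page is made present. Nothing was cached,
# so the next access walks the page table and finds the new mapping.
try:
    mmu.translate(0x10)
    first_access_faulted = False
except PageFault:
    first_access_faulted = True
mmu.map_page(0x10, 0x10)            # identity mapping
fresh_result = mmu.translate(0x10)

# Case 2: an existing mapping is changed without invalidation. The TLB still
# holds the old translation, so the access uses the wrong physical page.
mmu.map_page(0x10, 0x99)
stale_result = mmu.translate(0x10)
mmu.invalidate(0x10)
fixed_result = mmu.translate(0x10)
```

In this model only the second case misbehaves, which is why the missing invalidation in Solaris 2.5.1 and 2.6 looks risky but should not, by itself, be fatal.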

That’s clearly how things worked back in 1996-1997 when Solaris versions 2.5.1 and 2.6 were released. But something has changed, and it wasn’t the OS. There are at least two possibilities.

First, Solaris may be falling foul of more aggressive speculative execution. Intel documents that code fetches performed shortly after updating a page table may use the previous value in the absence of a synchronizing instruction between the page table update and the code fetch. That could be happening here, although it would imply truly impressive speculative execution.

Second, Solaris could be the victim of an Intel CPU erratum (bug). The problem was observed on two very different systems with Sandy Bridge Core i7 CPUs, which both contain a TLB-related erratum. The processor specification update (largely a bug list) lists erratum BJ88, “An Unexpected Page Fault May Occur Following the Unmapping and Re-mapping of a Page”. The symptoms match, although it’s unclear if that’s what’s truly causing the page fault.

Solaris 2.5 and earlier do not have the problem because 4MB pages aren’t supported, so there’s no need for the special identity-mapped code. Solaris 7 and later, on the other hand, buckled up and added code to read and write CR3 after updating the page tables. That causes a full TLB flush, so any problems with stale TLB entries or processor errata won’t happen.
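Why the CR3 reload helps can be sketched with another toy model. It is deliberately pessimistic and entirely hypothetical: ToyCPU remembers a “not present” result until it is flushed, which merely stands in for whatever really goes wrong (speculation or an erratum) and is not a claim about how Sandy Bridge behaves. All names and values are invented for illustration.

```python
class PageFault(Exception):
    """Raised when a translation resolves to 'not present'."""


class ToyCPU:
    """Hypothetical CPU whose TLB may hold a stale 'not present' result."""

    def __init__(self, cr3):
        self.cr3 = cr3
        self.page_table = {}  # virtual page -> physical page
        self.tlb = {}         # virtual page -> physical page, or None

    def write_cr3(self, value):
        # In this model, writing CR3 drops every cached translation.
        self.cr3 = value
        self.tlb.clear()

    def access(self, vpage):
        if vpage not in self.tlb:
            # Pessimistic assumption: even a failed lookup gets remembered.
            self.tlb[vpage] = self.page_table.get(vpage)
        ppage = self.tlb[vpage]
        if ppage is None:
            raise PageFault(hex(vpage))
        return ppage


def faults(cpu, vpage):
    """Return True if accessing vpage raises a page fault."""
    try:
        cpu.access(vpage)
        return False
    except PageFault:
        return True


cpu = ToyCPU(cr3=0x1000)
faulted_before_mapping = faults(cpu, 0x10)  # page really is absent

cpu.page_table[0x10] = 0x10                 # create the identity mapping
faulted_without_flush = faults(cpu, 0x10)   # stale state still says "absent"

cpu.write_cr3(cpu.cr3)                      # read CR3 and write it back
faulted_after_flush = faults(cpu, 0x10)     # page table is walked again
```

Whatever the real mechanism is, a flush between the page table update and the first use of the new mapping removes the window in which stale state can be observed.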

Working around the problem

When running Solaris 2.5.1 or 2.6 in a VM, there are at least two workarounds available. The first is not using nested paging. In that case, the paging behavior in the VM is very different and the TLB size is effectively much smaller. Page faults are handled on the host and only some are forwarded to the guest. However, turning off nested paging causes a performance hit (sometimes very noticeable), so it would be nice to not have to do that.

Another possible approach would be patching the Solaris kernel, but that was not explored.

The other tested workaround is convincing Solaris that it shouldn’t even try to use 4MB pages. There is little benefit from using large pages in the 2.5.1 and 2.6 Solaris releases, so giving up 4MB page support is not particularly painful.

All it takes is removing the PSE bit (Page Size Extensions) from CPUID information. To be exact, it’s bit 3 in register EDX in CPUID leaf 1. VirtualBox unfortunately doesn’t allow masking out specific bits, so one has to take the CPUID leaf from the host (with VBoxManage list hostcpuids), modify the data, and update the VM configuration with VBoxManage modifyvm --cpuidset.

For example, if EDX in CPUID leaf 1 on the host contains (hexadecimal) bfebfbff, the value needs to be modified to bfebfbf7 — that is, bit 3 must be cleared. The VM settings would then be modified with a command similar to VBoxManage modifyvm Solaris --cpuidset 1 000206a7 03100800 17bae3ff bfebfbf7.
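The bit arithmetic is easy to get wrong by hand, so here is a minimal sketch that does it in Python. The register values are the example values from the text above; the helper names and the VM name Solaris are only illustrative, and the script merely prints the command rather than running it.

```python
PSE_BIT = 3  # Page Size Extensions: bit 3 of EDX in CPUID leaf 1


def has_pse(edx):
    """Return True if the PSE bit is set in a leaf 1 EDX value."""
    return bool(edx & (1 << PSE_BIT))


def clear_pse(edx):
    """Return the EDX value with the PSE bit cleared (kept to 32 bits)."""
    return edx & ~(1 << PSE_BIT) & 0xFFFFFFFF


# Example host values for CPUID leaf 1, as in the text above.
eax, ebx, ecx, host_edx = 0x000206A7, 0x03100800, 0x17BAE3FF, 0xBFEBFBFF

guest_edx = clear_pse(host_edx)

# Build the command line; every other bit is passed through unchanged.
command = (
    "VBoxManage modifyvm Solaris --cpuidset 1 "
    f"{eax:08x} {ebx:08x} {ecx:08x} {guest_edx:08x}"
)
print(command)
```

On a different host, the four values would come from the leaf 1 line of VBoxManage list hostcpuids instead.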

And with that, Solaris 2.5.1 reliably comes up and greets the user with OpenWindows. Solaris 2.6 no longer sulks either and starts CDE.

As mentioned earlier, Solaris 7 does not have this problem. The code which needs to run at an identity-mapped address was made somewhat more complex and could be called more than once. That necessitated explicit TLB flushing and the problem was thus avoided.


12 Responses to Solaris 2.5.1 and 2.6 crashes on modern Intel CPUs

  1. joedemo42 says:

    Wow – just wow. It’s because of articles like this, that I love this web site so much.

  2. Lochkartenstanzer says:

    great!

    I was looking for a way to get Solaris 2.6/x86 running as a guest in any hypervisor, but had no luck. With this hint, I hope I can save my Solaris 2.6 boxes in VMs before their hardware disintegrates.

    Will try it soon and give feedback.

    lks

  3. AndrewGore says:

    I am running into a similar situation. I have a CNC machine that runs on a Solaris 2.6 based computer. The machine’s motherboard and CPU are ISA slot and continually fail. I’m in the process of building a new computer for it, with a newer processor etc. Unfortunately, I’m running into exactly what you describe here.

    If you have time, please take a look here: http://www.reddit.com/r/solaris/comments/2ur2o4/i_have_a_solaris_30sun_os_56_computer_that_i_am/

    I’ve listed all I’ve gone through (that post is where I was shown your site).

  4. Michal Necasek says:

    Unfortunately I can’t make out enough of the error message to see if you’re hitting the same problem. It’s possible that you are, and that Pentium 4 already has this issue. All I know is that modern Intel CPUs have the problem and Pentium II does not (clearly Pentium MMX, your old system, doesn’t either).

    Have you considered using an AMD-based system? That might work… Alternatively, a Pentium III (rather than Pentium 4) system could also do the trick. If the 4MB page support is really the problem, then patching the Solaris kernel would be another option. Oh, or upgrading to Solaris 7.

  5. AndrewGore says:

    Michal,
    Thank you for your response. During Boot Interpreter, when I input “b kadb -d”, I receive the same error you show in your example above. If I use “b kadb” I get the following: http://i.imgur.com/zjrK50A.png All of which as you noted & found point to the Pentium 4 Compatibility. I unfortunately am not the most fluent person in Solaris, so I’m not sure what is all involved with patching the Solaris Kernel.
    I would like to use any other platform, but my system requires me to have 3 ISA Slots. Finding a motherboard like that that isn’t 15-20 years old is difficult. I went with what I could find, unfortunately not realizing the incompatibility with Solaris 2.6.
    I’m currently attempting to install a newer Solaris version onto the computer. I’ll then have to figure out how to get the machine software off of the old system, and make it all work on the new system. That or I attempt to find an ISA Slot cpu that is more readily available to replace the unit in the existing system.
    I appreciate your help & feedback.

    Andrew

  6. Michal Necasek says:

    You’ve piqued my curiosity. I’ll have to try Solaris 2.6 on a few Pentium III/4 systems to see what works and what doesn’t… but it will take a few days.

    If it turns out that the Pentium 4 is really the problem, I’ll give some thought to patching the Solaris kernel. When you run ‘uname -a’, what exactly do you get?

  7. AndrewGore says:

    I put in “uname -a” and “b uname -a” at the only time I know of possible, during the boot interpreter:
    http://i.imgur.com/ox4uhgm.jpg

    Is there another point I should try inputting this?

    I’m still working on doing a new Solaris install in the interim.

    Andrew

  8. Michal Necasek says:

    I admire your resolve — you clearly don’t have much of a clue about Unix 🙂 The ‘uname -a’ command would have to be run from a command shell, not the boot prompt… so that won’t work if your system doesn’t boot up. But it doesn’t matter because your previous screenshot told me what I needed. You’re running Solaris 2.6 “version Generic” which means no patches. And that means I can easily check the behavior of the same OS version.

    Anyway, today I tried booting Solaris 2.6 on an Istanbul generation AMD Opteron and it worked fine. I failed to boot it up on a Core 2 system, not because of the CPU but because I couldn’t convince the system to present an IDE-compatible CD-ROM. So I don’t know if a Core 2 class CPU works or not. I did verify that Solaris 2.6 works on a 1.13 GHz Pentium III (Dell Latitude C810). Unfortunately I don’t have any Pentium 4 system at hand to try.

    But that’s all probably moot anyway, because in the screenshot I noticed that you’re definitely not running into the same problem. You’re seeing a crash in the int20 routine, probably due to corrupted stack. Right now I can’t say what causes it and if there’s some way to work around it. I need to do some digging.

    What you could try is disabling everything on your board that isn’t absolutely necessary (USB, audio, serial/parallel ports, etc.) and removing all non-essential devices.

  9. Michal Necasek says:

    So the int20 routine is intended for dropping into the debugger, and it’s probably only called if something else already went wrong. Apart from the previous suggestion (remove/disable all devices, perhaps also reduce memory size), you could try booting with ‘b kadb -v’. That could at least give some hint as to what might be going wrong.

    Oh, and I think others mentioned that, but booting with the ‘-r’ switch could perhaps help, too.

    Does the new board have a multi-core processor or hyperthreading? Disabling all but one logical processor might help too.

  10. AndrewGore says:

    Michal,
    I’ve attempted to boot the machine with every option in the BIOS settings disabled. It seemed to not make much of a difference. The new board and processor as far as I know are not multi-core or hyperthreading. I’ll have to verify 100%. I’ll also look into reducing the amount of memory. I only have the 2 – 1gb sticks of appropriate ram on hand, so I can look into trying some smaller ones.
    The real confusing thing to me is how I was able to boot everything on a different, newer computer. I’ll grab the specs on that machine on Monday. I do know when I was booting with that machine, that it had at least 4gb of ram in it and is a 64 bit machine.
    On the upside, I was able to get one of my original machines with the ISA Motherboards back up and working. Long story short, I found there was an aftermarket jumper wire on the board that was shorting out my cmos battery, resetting all my BIOS settings. I’ll continue probing into getting this other machine working as it’ll be good to have a backup plan in place.

    Andrew

  11. Michal Necasek says:

    My guess is that the reason for the crash is not the CPU but the BIOS or some other piece of hardware (‘b kadb -v’ might reveal something). The AMD system I booted it up on was from 2010 or 2011, not that old.

    It sounds like you already successfully booted it up on a system with 4GB RAM too. Now that I think about it, the above-mentioned AMD system had 8GB RAM. So that shouldn’t be an issue (although you never know, different boards can handle things differently).

    Whether a CPU has 64-bit support doesn’t matter much. A 32-bit OS like Solaris 2.6 just doesn’t care about the 64-bit extensions.

    What exactly is your Pentium 4 board? And do you know the exact CPU model?

  12. Michal,

    Great Article! It was quite helpful in getting Solaris 2.5.1 installed in VirtualBox. 🙂
    So, I figured I’d share the patch I wrote to fix this issue in the 2.5.1 kernel: http://hentenaar.com/patch-for-the-solaris-251-paging-bug. I can imagine it might apply to 2.6 also with a few modifications, but lacking a copy of the 2.6 kernel, I can’t say.
