Solaris 2.5.1 and 2.6 crashes on modern Intel CPUs

I recently found that Solaris 2.6 and 2.5.1 do not work when run in a VM on a modern Intel CPU (Sandy Bridge generation Core i7), or to be exact, fail most of the time (about nine times out of ten) when nested paging is used. The symptom is Solaris hanging or rebooting immediately after the kernel is loaded, even before the kernel banner is printed. When nested paging (or hardware virtualization) isn’t used, there’s no problem.

After managing to boot Solaris 2.6 on a physical Core i7 system (not so easy, since a boot floppy plus CD is required!), it turned out that the exact same thing happens, and this is thus not a virtualization issue. But what’s going on there? And why would nested paging fail when the older, slower virtualization methods work? Thanks to the debugging capabilities built into Solaris, it’s possible to answer most of those questions.

One of the nice things about Solaris is that it has had a usable kernel debugger for a long time, and unlike most operating systems, the kernel debugger is always there and in fact available even on the installation CD. That makes it possible to debug a problem like this without even installing Solaris.

All one needs to do is enter b kadb -d at the boot prompt. That will cause kadb (the kernel debugger, kernel adb) to be loaded; the -d switch causes kadb to stop at the earliest opportunity, so that the user has a chance to set breakpoints etc.

It should be noted that kadb is derived from adb, and thus uses syntax which can only be described as “interesting”.

Cause of the crash

The crash occurs very early during system initialization. The first thing the Solaris kernel does is detect the CPU type, and if support for 4MB pages is available (Pentium/Pentium Pro class CPUs), that support is enabled. This is where things go wrong.

The Solaris kernel hits a page fault, which is intercepted and handled by the kernel. Unfortunately, the fault handler triggers another exception, perhaps because the kernel isn’t really set up yet, causing a cascade of faults. The stack eventually overflows and ends up overwriting the GDT, which is stored in memory just below the stack. That is fatal, because the trap handling code reloads segment registers a lot, and with a corrupted GDT, an exception cannot be dispatched. OS death ensues.

The ultimate cause is the first trap in the cycle. To enable 4MB pages, Solaris needs to modify the CR4 register, which must be done with paging disabled. To that end, Solaris creates an identity mapping (physical address equals virtual address) for a single page which holds the routine modifying CR4. The routine turns off paging in CR0, updates CR4, turns paging on again, and returns to the caller. The (indirect) call to the routine causes the crash.
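The three register writes are easier to follow when spelled out. The following is a minimal Python sketch that only models the bit manipulation; it is not the Solaris code, and the function name enable_pse and the starting register values are assumptions chosen for illustration. The bit positions (PG is bit 31 of CR0, PSE is bit 4 of CR4) are architectural.

```python
CR0_PG = 1 << 31   # CR0 bit 31: paging enable
CR4_PSE = 1 << 4   # CR4 bit 4: Page Size Extensions (4MB pages)


def enable_pse(cr0, cr4):
    """Model the routine's three register writes (hypothetical helper).

    Returns the (cr0, cr4) pair after each step. On real hardware the code
    doing this must sit on an identity-mapped page: once PG is cleared,
    instruction fetches use physical addresses, so execution can only
    continue if the physical address equals the virtual one.
    """
    steps = []
    cr0 &= ~CR0_PG           # 1. turn paging off
    steps.append((cr0, cr4))
    cr4 |= CR4_PSE           # 2. enable 4MB page support
    steps.append((cr0, cr4))
    cr0 |= CR0_PG            # 3. turn paging back on
    steps.append((cr0, cr4))
    return steps


# Example starting values (assumed): paging enabled, PSE not yet set.
for cr0, cr4 in enable_pse(0x8005003B, 0x00000000):
    print(f"cr0={cr0:08x} cr4={cr4:08x}")
```

Note that in the failing case the sequence never gets this far: the fault happens on the call into the identity-mapped page, before any of these writes execute.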

A bit of work with kadb and knowledge of the x86 architecture makes it possible to determine that the indirect call instruction causes a page fault because the destination page is not present. Yet using the debugger to examine the supposedly non-present page shows that it’s very much there. It’s also safe to assume that when Solaris 2.6 was released, it did not crash like that. What’s going on there?

Changed semantics and living dangerously

The true cause of the crash is not easy to determine. What is known is that Solaris 2.6 happily works on Pentium II class machines, as well as AMD Phenom CPUs. It may be only the Intel Core i5/i7 CPUs it has trouble with.

It’s fairly clear that one strong contributing factor to the crash is that Solaris updates the page tables but does not invalidate the TLB (Translation Lookaside Buffer) for the updated page. That’s living dangerously, but by itself shouldn’t cause problems. If an existing page mapping were changed without invalidating the TLB, that would be asking for trouble. However, in this case a previously non-present page is made present, and in that case the CPU can’t store anything in the TLB. Therefore when the page is referenced, the CPU should traverse the page tables in memory and find the expected mapping.
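The difference between the two cases can be shown with a toy model. This is a deliberately simplified Python sketch of the behavior the argument above expects, not a description of how any real CPU (least of all Sandy Bridge) implements its TLB; the class ToyMMU and its method names are invented for illustration.

```python
class ToyMMU:
    """Toy model: a page table plus a TLB that caches only present mappings."""

    def __init__(self):
        self.page_table = {}  # virtual page -> physical page (present pages only)
        self.tlb = {}         # cached translations

    def map_page(self, vpage, ppage):
        """Update the page table WITHOUT touching the TLB."""
        self.page_table[vpage] = ppage

    def invalidate(self, vpage):
        """Drop one cached translation (the role INVLPG plays on x86)."""
        self.tlb.pop(vpage, None)

    def translate(self, vpage):
        if vpage in self.tlb:
            # TLB hit: the page table in memory is not consulted at all.
            return self.tlb[vpage]
        if vpage not in self.page_table:
            # Not present: the access faults and nothing is cached.
            raise LookupError(f"page fault at page {vpage:#x}")
        # TLB miss: walk the page table and cache the result.
        self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]


mmu = ToyMMU()

# Case 1: a previously non-present page is made present. No invalidation
# is needed, because there was never anything to cache.
mmu.map_page(0x5, 0x5)
print(hex(mmu.translate(0x5)))

# Case 2: an existing mapping is changed. Without invalidation the stale
# translation is still returned.
mmu.map_page(0x5, 0x9)
print(hex(mmu.translate(0x5)))   # stale
mmu.invalidate(0x5)
print(hex(mmu.translate(0x5)))   # fresh
```

Solaris 2.5.1 and 2.6 rely on case 1, which is why the missing invalidation alone should not have been fatal.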

That’s clearly how things worked back in 1996-1997 when Solaris versions 2.5.1 and 2.6 were released. But something has changed, and it wasn’t the OS. There are at least two possibilities.

First, Solaris may be falling foul of more aggressive speculative execution. Intel documents that code fetches performed shortly after updating a page table may use the previous value in the absence of a synchronizing instruction between the page table update and the code fetch. That could be happening here, although it would imply truly impressive speculative execution.

Second, Solaris could be the victim of an Intel CPU erratum (bug). The problem was observed on two very different systems with Sandy Bridge Core i7 CPUs, which both contain a TLB-related erratum. The processor specification update (largely a bug list) lists erratum BJ88, “An Unexpected Page Fault May Occur Following the Unmapping and Re-mapping of a Page”. The symptoms match, although it’s unclear if that’s what’s truly causing the page fault.

Solaris 2.5 and earlier do not have the problem because 4MB pages aren’t supported, so there’s no need for the special identity-mapped code. Solaris 7 and later, on the other hand, buckled up and added code to read and write CR3 after updating the page tables. That causes a full TLB flush, so any problems with stale TLB entries or processor errata won’t happen.

Working around the problem

When running Solaris 2.5.1 or 2.6 in a VM, there are at least two workarounds available. The first is not using nested paging. In that case, the paging behavior in the VM is very different and the TLB size is effectively much smaller. Page faults are handled on the host and only some are forwarded to the guest. However, turning off nested paging causes a performance hit (sometimes very noticeable), so it would be nice to not have to do that.

Another possible approach would be patching the Solaris kernel, but that was not explored.

The other tested workaround is convincing Solaris that it shouldn’t even try to use 4MB pages. There is little benefit from using large pages in the 2.5.1 and 2.6 Solaris releases, so giving up 4MB page support is not particularly painful.

All it takes is removing the PSE bit (Page Size Extensions) from CPUID information. To be exact, it’s bit 3 in register EDX in CPUID leaf 1. VirtualBox unfortunately doesn’t allow masking out specific bits, so one has to take the CPUID leaf from the host (with VBoxManage list hostcpuids), modify the data, and update the VM configuration with VBoxManage modifyvm --cpuidset.

For example, if EDX in CPUID leaf 1 on the host contains (hexadecimal) bfebfbff, the value needs to be modified to bfebfbf7 — that is, bit 3 must be cleared. The VM settings would then be modified with a command similar to VBoxManage modifyvm Solaris --cpuidset 1 000206a7 03100800 17bae3ff bfebfbf7.
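The bit arithmetic is simple enough to check with a few lines of Python. This is only a sketch: the helper names clear_pse and cpuidset_command are made up for illustration, and the register values are the example values from this post, not something to copy blindly. Use the values reported for your own host.

```python
PSE_BIT = 3  # CPUID leaf 1, EDX bit 3: Page Size Extensions


def clear_pse(edx):
    """Return the leaf 1 EDX value with the PSE bit cleared."""
    return edx & ~(1 << PSE_BIT) & 0xFFFFFFFF


def cpuidset_command(vm, leaf, eax, ebx, ecx, edx):
    """Format a command line like the one shown above (hypothetical helper)."""
    regs = " ".join(f"{reg:08x}" for reg in (eax, ebx, ecx, edx))
    return f"VBoxManage modifyvm {vm} --cpuidset {leaf:x} {regs}"


host_edx = 0xBFEBFBFF               # example host value from the text
guest_edx = clear_pse(host_edx)
print(f"{guest_edx:08x}")           # bfebfbf7
print(cpuidset_command("Solaris", 1, 0x000206A7, 0x03100800, 0x17BAE3FF, guest_edx))
```

The other three registers are passed through unchanged; only EDX differs from what the host reports.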

And with that, Solaris 2.5.1 reliably comes up and greets the user with OpenWindows.

Solaris 2.6 no longer sulks either and comes up with CDE.

As mentioned earlier, Solaris 7 does not have this problem. The code which needs to run at an identity mapped address was made somewhat more complex and could be called more than once. That necessitated explicit TLB flushing and the problem was thus avoided.


27 Responses to Solaris 2.5.1 and 2.6 crashes on modern Intel CPUs

  1. joedemo42 says:

    Wow – just wow. It’s because of articles like this, that I love this web site so much.

  2. Lochkartenstanzer says:

    great!

    I was looking for a way to get Solaris 2.6/x86 running as a guest in any hypervisor, but had no luck. With this hint, I hope I can save my Solaris 2.6 boxes in VMs before their hardware disintegrates.

    Will try it soon and give feedback.

    lks

  3. AndrewGore says:

    I am running into a similar situation. I have a CNC machine that runs on a Solaris 2.6 based computer. The machine’s motherboard\CPU are ISA slot & continually fail. I’m in the process of building a new computer for it, with newer processor\etc. Unfortunately, I’m running into exactly what you describe here.

    If you have time, please take a look here: http://www.reddit.com/r/solaris/comments/2ur2o4/i_have_a_solaris_30sun_os_56_computer_that_i_am/

    I’ve listed all I’ve gone through (that post is where I was shown your site).

  4. Michal Necasek says:

    Unfortunately I can’t make out enough of the error message to see if you’re hitting the same problem. It’s possible that you are, and that Pentium 4 already has this issue. All I know is that modern Intel CPUs have the problem and Pentium II does not (clearly Pentium MMX, your old system, doesn’t either).

    Have you considered using an AMD-based system? That might work… Alternatively, a Pentium III (rather than Pentium 4) system could also do the trick. If the 4MB page support is really the problem, then patching the Solaris kernel would be another option. Oh, or upgrading to Solaris 7.

  5. AndrewGore says:

    Michal,
    Thank you for your response. During Boot Interpreter, when I input “b kadb -d”, I receive the same error you show in your example above. If I use “b kadb” I get the following: http://i.imgur.com/zjrK50A.png All of which as you noted & found point to the Pentium 4 Compatibility. I unfortunately am not the most fluent person in Solaris, so I’m not sure what is all involved with patching the Solaris Kernel.
    I would like to use any other platform, but my system requires me to have 3 ISA Slots. Finding a motherboard like that that isn’t 15-20 years old is difficult. I went with what I could find, unfortunately not realizing the incompatibility with Solaris 2.6.
    I’m currently attempting to install a newer Solaris version onto the computer. I’ll then have to figure out how to get the machine software off of the old system, and make it all work on the new system. That or I attempt to find an ISA Slot cpu that is more readily available to replace the unit in the existing system.
    I appreciate your help & feedback.

    Andrew

  6. Michal Necasek says:

    You’ve piqued my curiosity. I’ll have to try Solaris 2.6 on a few Pentium III/4 systems to see what works and what doesn’t… but it will take a few days.

    If it turns out that the Pentium 4 is really the problem, I’ll give some thought to patching the Solaris kernel. When you run ‘uname -a’, what exactly do you get?

  7. AndrewGore says:

    I put in “uname -a” and “b uname -a” at the only time I know of possible, during the boot interpreter:
    http://i.imgur.com/ox4uhgm.jpg

    Is there another point I should try inputting this?

    I’m still working on doing a new Solaris install in the interim.

    Andrew

  8. Michal Necasek says:

    I admire your resolve — you clearly don’t have much of a clue about Unix 🙂 The ‘uname -a’ command would have to be run from a command shell, not the boot prompt… so that won’t work if your system doesn’t boot up. But it doesn’t matter because your previous screenshot told me what I needed. You’re running Solaris 2.6 “version Generic” which means no patches. And that means I can easily check the behavior of the same OS version.

    Anyway, today I tried booting Solaris 2.6 on an Istanbul generation AMD Opteron and it worked fine. I failed to boot it up on a Core 2 system, not because of the CPU but because I couldn’t convince the system to present an IDE-compatible CD-ROM. So I don’t know if a Core 2 class CPU works or not. I did verify that Solaris 2.6 works on a 1.13 GHz Pentium III (Dell Latitude C810). Unfortunately I don’t have any Pentium 4 system at hand to try.

    But that’s all probably moot anyway, because in the screenshot I noticed that you’re definitely not running into the same problem. You’re seeing a crash in the int20 routine, probably due to corrupted stack. Right now I can’t say what causes it and if there’s some way to work around it. I need to do some digging.

    What you could try is disabling everything on your board that isn’t absolutely necessary (USB, audio, serial/parallel ports, etc.) and removing all non-essential devices.

  9. Michal Necasek says:

    So the int20 routine is intended for dropping into the debugger, and it’s probably only called if something else already went wrong. Apart from the previous suggestion (remove/disable all devices, perhaps also reduce memory size), you could try booting with ‘b kadb -v’. That could at least give some hint as to what might be going wrong.

    Oh, and I think others mentioned that, but booting with the ‘-r’ switch could perhaps help, too.

    Does the new board have a multi-core processor or hyperthreading? Disabling all but one logical processor might help too.

  10. AndrewGore says:

    Michal,
    I’ve attempted to boot the machine with every option in the BIOS settings disabled. It seemed to not make much of a difference. The new board and processor as far as I know are not multi-core or hyperthreading. I’ll have to verify 100%. I’ll also look into reducing the amount of memory. I only have the 2 – 1gb sticks of appropriate ram on hand, so I can look into trying some smaller ones.
    The real confusing thing to me is how I was able to boot everything on a different, newer computer. I’ll grab the specs on that machine on Monday. I do know when I was booting with that machine, that it had at least 4gb of ram in it and is a 64 bit machine.
    On the upside, I was able to get one of my original machines with the ISA Motherboards back up and working. Long story short, I found there was an aftermarket jumper wire on the board that was shorting out my cmos battery, resetting all my BIOS settings. I’ll continue probing into getting this other machine working as it’ll be good to have a backup plan in place.

    Andrew

  11. Michal Necasek says:

    My guess is that the reason for the crash is not the CPU but the BIOS or some other piece of hardware (‘b kadb -v’ might reveal something). The AMD system I booted it up on was from 2010 or 2011, not that old.

    It sounds like you already successfully booted it up on a system with 4GB RAM too. Now that I think about it, the above-mentioned AMD system had 8GB RAM. So that shouldn’t be an issue (although you never know, different boards can handle things differently).

    Whether a CPU has 64-bit support doesn’t matter much. A 32-bit OS like Solaris 2.6 just doesn’t care about the 64-bit extensions.

    What exactly is your Pentium 4 board? And do you know the exact CPU model?

  12. Michal,

    Great Article! It was quite helpful in getting Solaris 2.5.1 installed in VirtualBox. 🙂
    So, I figured I’d share the patch I wrote (http://hentenaar.com/patch-for-the-solaris-251-paging-bug) to fix this issue in the 2.5.1 kernel. I can imagine it might apply to 2.6 also with a few modifications, but lacking a copy of the 2.6 kernel, I can’t say.

  13. Stephan says:

    Strange. How did you get X11 up and running in 1024*768 resolution with PseudoColor? With the drivers included in the VirtualBox Guest Additions? Or do you use an alternative X11 server?

    I managed to get Solaris 2.4 running within VMware Fusion 4 but without X11.
    I tried to install Solaris 2.4 with VirtualBox first but it didn’t work. What does your VBox configuration look like?

    Cheers, Stephan

  14. Michal Necasek says:

    That’s a really good question. And the answer is “neither”. There are no drivers for Solaris 2.x included in the VirtualBox Guest Additions, and I only used what came with Solaris 2.x. I did however add a file /usr/openwin/share/etc/devdata/svpmi/SUNWvga8/vbox-8.pmi which I created myself. It adds support for the VirtualBox SVGA emulated chip to the SUNWvga8 server. It may have required some manual hacking which I forgot in the meantime. If you want the PMI file, let me know.

  15. Stephan says:

    Thanks! That’s the answer I was looking for. I did this for the CL GD-5428 PMI file to extend the config from 512k to 1024k but I didn’t think about the built-in SUNWvga8 files. I will take a look first.

    I still have another problem. With Fusion 4 (VMware) I was able to get Solaris 2.4 installed. But if I try to start ‘kdmconfig -cv’ inside the guest, the program terminates, of course without error messages. I don’t remember the exit code and have to check again.

    I would appreciate it if you could assist with the VBox config instead, just because I prefer VB over VMware.

  16. Michal Necasek says:

    The basic VM config I have for Solaris 2.4 is pretty simple… 64MB RAM, 500MB IDE hard disk, IDE CD-ROM, floppy, no audio, PCnet-FAST III network chip, enabled COM1 port (just to avoid some warnings), no USB. I believe ATAPI CD-ROM support requires updated Solaris 2.4 boot disks. It should also be possible to use BusLogic SCSI.

  17. Stephan says:

    Ok. I got it up and running with VBox 4.3.20 on a 2GiB IDE drive (VDI image) with 32MiB RAM. I chose to install it over the network from my install server.

    Then I tried to install the VBox guest additions for Solaris under 2.4. The SVR4 package stream containing the VBox guest additions for Solaris (VBoxSolarisAdditions.pkg) is, of course, incompatible with versions prior to Solaris 8. I have access to the package but I think it will make no difference, because of the internal symbols, lib versions, package dependencies (e.g. SUNWuiu8) and missing tools (e.g. isainfo).
    It was quite easy to get access to the package contents and make them available under Solaris 2.4:
    (jumper := my install server w/ Solaris 10)
    jumper# pkgadd -s /root/vboxguestaddsol -d ./VBoxSolarisAdditions.pkg

    The following packages are available:
    1 SUNWvboxguest Oracle VM VirtualBox Guest Additions
    (i386) 4.3.20,REV=r96996.2014.11.21.14.59

    Select package(s) you wish to process (or ‘all’ to process
    all packages). (default: all) [?,??,q]: 1
    Transferring package instance
    # cd vboxguestaddsol
    # tar cvf vboxguestaddsol.tar SUNWvboxguest
    a SUNWvboxguest/ 0K
    a SUNWvboxguest/pkginfo 1K
    a SUNWvboxguest/pkgmap 8K
    a SUNWvboxguest/install/ 0K
    a SUNWvboxguest/install/depend 1K
    a SUNWvboxguest/install/postinstall 15K
    a SUNWvboxguest/install/space 1K
    a SUNWvboxguest/install/preremove 3K
    a SUNWvboxguest/reloc/ 0K
    (…)

    FTP transfer to VM sol24vbox

    # cd SUNWvboxguest
    # ls -l
    total 22
    drwxr-xr-x 2 101 staff 512 Apr 1 03:21 install
    -rw-r--r-- 1 101 staff 457 Nov 21 2014 pkginfo
    -rw-r--r-- 1 101 staff 7779 Nov 21 2014 pkgmap
    drwxr-xr-x 5 101 staff 512 Apr 1 03:21 reloc

    # /usr/ccs/bin/nm /root/SUNWvboxguest/reloc/usr/kernel/drv/vboxguest
    Symbols from vboxguest:
    (…)

    # pwd
    /root/SUNWvboxguest/reloc/opt/VirtualBoxAdditions
    # ls -l
    total 114
    -rwxr-xr-x 1 101 staff 1339 Nov 21 2014 1099.vboxclient
    -rw-r--r-- 1 101 staff 20516 Nov 21 2014 LICENSE
    -rwxr-xr-x 1 101 staff 1547 Nov 21 2014 VBox.sh
    -rwxr-xr-x 1 101 staff 10924 Nov 21 2014 VBoxClient
    drwxr-xr-x 2 101 staff 1024 Apr 1 03:21 amd64
    drwxr-xr-x 2 101 staff 1024 Apr 1 03:21 i386
    -rw-r--r-- 1 101 staff 2946 Nov 21 2014 solaris_xorg.conf
    -rw-r--r-- 1 101 staff 2073 Nov 21 2014 solaris_xorg_modeless.conf
    -rw-r--r-- 1 101 staff 371 Nov 21 2014 vboxclient.desktop
    -rwxr-xr-x 1 101 staff 5708 Nov 21 2014 vboxguest.sh
    -rwxr-xr-x 1 101 staff 2487 Nov 21 2014 x11config15sol.pl
    -rwxr-xr-x 1 101 staff 2180 Nov 21 2014 x11restore.pl

    FYI: Solaris 2.4 x86 has no problem, if running in VirtualBox on MBP8,2 under Mac OS X Lion 10.7.5 (x86_64) on Core i7 … until now. But I will get another problem with one patch from the Y2K cluster

    If still possible, I would like to come back to your offer for your PMI file. 😉

    Thanks, Stephan

  18. r.stricklin says:

    I’ve recently been puzzling over remarkably similar behavior with Solaris 2.4 and 2.5 on an ALR Evolution IVe with a 486 DX2/66.

    Turned out to be the CPU revision is too new! The SX955 DX2 has a write-back cache mode that isn’t present on earlier S-spec 486s. Use any older rev DX2 in the machine, Solaris boots and runs. I’m guessing that Solaris detects this write-back cache capability and tries to enable it very early, and that the Evolution IVe doesn’t support it (as evidenced by the machine locking up during POST with a P24T installed).

    Lots of other OSes run fine with the SX955 DX2 installed in this machine: DOS, OS/2 1.30 or 2.1, Warp 3, Coherent 4.0 or 4.1, Interactive 4.1, NEXTSTEP 3.1 or 3.3, NT 3.1 or 3.5, Plan 9 4th ed, 386BSD… even Solaris 2.1. But not Solaris 2.4, 2.5, or 2.6.

  19. Michal Necasek says:

    Unless I’m misremembering (always possible), the 486 WB cache is not something software enables, it’s some pin strapping option. The CPU even comes up with a different CPUID value. It is possible the board enables it when it shouldn’t, and most older software won’t provoke failures. Solaris 2.5 definitely tries to do 486 cache management (enabling, disabling, flushing) which might expose problems that the older OSes won’t.

  20. r.stricklin says:

    Looks like you’re right, it needs logic high on WB/WT (B13) to enable WB mode if the CPU supports it. Perhaps then the problem is that Solaris is able to detect the feature is present in the CPU and blithely assumes it must work. Obviously I don’t actually know; I’m just guessing. The official pinout indicates B13 is internal no connection on non-WB 486s, so it seems unlikely that Solaris 2.5 could be causing the motherboard to do anything unexpected with that pin.

  21. Michal Necasek says:

    I don’t think there’s any explicit indication in the CPU as to how the cache behaves. The only way I know of is checking the CPUID value, which is different in WB/WT modes (see here). And software can’t enable or disable it. There are instructions which behave differently, like WBINVD (in WT mode, there’s nothing to write back).

    The difference could also be caused by the fact that the SX955 S-spec supports CPUID, and the older ones likely don’t. That may cause Solaris to take a different path somewhere. Have you tried using kadb to see if Solaris is crashing, and if so, where?

  22. Michal Necasek says:

    Turns out I had an SX955 CPU in my junk pile. You can definitely tell if it uses WB cache or not based on the CPUID. If it’s 0436 then the cache runs in WT mode, if it’s 0470 the cache is in WB mode. The CPU supports the CPUID instruction so it should not be too difficult to find out. Mine runs in WT mode on the board I tried it in (OPTi 82C499 chipset), shows 0436 TFMS signature.

  23. astm says:

    “That’s a really good question. And the answer is “neither”. There are no drivers for Solaris 2.x included in the VirtualBox Guest Additions, and I only used what came with Solaris 2.x. I did however add a file /usr/openwin/share/etc/devdata/svpmi/SUNWvga8/vbox-8.pmi which I created myself. It adds support for the VirtualBox SVGA emulated chip to the SUNWvga8 server. It may have required some manual hacking which I forgot in the meantime. If you want the PMI file, let me know.”

    “If still possible, I would like to come back to your offer for your PMI file. ”

    I would also like to ask for your PMI file for Solaris 2.x 🙂

  24. Wolfgang says:

    Michal,

    unfortunately Solaris 2.6 crashes on contemporary Intel CPUs too if a micro channel system with more than 64MB RAM is used.

    Thanks to this inspiring blog I made an attempt to boot Solaris on an IBM 9595 (Pentium90) with 128MB RAM. The operating system was installed successfully with 64MB RAM.

    As expected, the kernel crashes after adding an additional 64MB of system RAM. At the next boot I tried ‘b kadb -d’ and continued with ‘:c’. The system loads the network driver and finds the local filesystems. Here are the final screen messages before the kernel exits to the kadb console:

    […]
    The system is coming up. Please wait.
    checking ufs filesystems
    /dev/rdsk/c0d0s1: is stable
    /dev/rdsk/c0d1s7: is stable
    /dev/rdsk/c0d0s5: is stable
    /dev/rdsk/c0d0s7: is stable
    BAD TRAP
    sh: Page Fault
    Kernel fault at addr=0x0, pte=0x0
    pid=125, pc=0x0, sp=0xf5d63ed9, eflags=0x10202

    eip(0), eflags(10202), ebp(e0c7b960), uesp(f5d63ed9), esp(e0c7b92c)
    eax(f5d09468), ebx(87000), ecx(727b8), edx(1000), esi(f6195318), edi(f54468c8)
    cr0(8005003b), cr2(0), cr3(9c1000)
    cs(158) ds(160) ss(db28) es(160) fs(1a8) gs(1b0)
    panic: Page Fault
    stopped at:
    int20+0xb: ret
    kadb[0]

    I’m not sure if this information gives much more than a page fault reference. However this system state is replicable and further investigations are possible. I do not expect to get a solution in the form of a kernel patch here. Delimiting the underlying problem alone would be great.

    Thanks

    Wolfgang

  25. Michal Necasek says:

    That looks like a call through a null pointer (‘eip(0)’). Would be interesting to know where it was called from, if the information is there.

  26. Michal Necasek says:

    For Solaris 2.4, a file called vbox-8.pmi needs to be placed in the /usr/openwin/share/etc/devdata/svpmi/SUNWvga8 directory. Now, I did this ten years ago, so I’m really fuzzy on the details. I’m pretty sure running kdmconfig was somehow involved. It should pick up the VirtualBox graphics device and offer it in the list.

    This should work up to and including Solaris 2.6 (i.e. 2.4, 2.5, 2.5.1, 2.6). For Solaris 7 and/or 8 I believe at minimum the files needed to be placed in a different directory. More later.

  27. Sergey Temerkhanov says:

    On VMWare it is possible to mask out CPUID bits. For the PSE bit in question the following config line works:
    cpuid.1.edx="----:----:----:----:----:----:----:0---"
