A blog reader recently pointed to an interesting problem which affects older Solaris releases on certain systems. The symptoms (crash/reboot) may at first glance look like the previously described problem which affected Solaris 2.5.1 and 2.6, but both the cause and the set of affected systems are different.
When systems based on the Pentium 4 started appearing in the early 2000s, users of several then-recent versions of Intel editions of Solaris discovered that Solaris could not be successfully booted (or installed) on Pentium 4 processors. The affected versions were Solaris 2.6 (1997), Solaris 7 (1998) and Solaris 8 (2000). On the other hand, Solaris 2.5.1 (1996) and older continued working; Solaris 9 (2003) was never affected.
The problem manifested itself as a “BAD TRAP” panic very early in the boot, often but not always accompanied by a triple fault/reboot. There was no easy way to avoid the problem, but there was a workaround which required a little bit of typing, and which was available thanks to the very helpful Solaris kernel debugger. Because the kernel debugger was available even on the installation media, it was entirely possible to engage the workaround, install the OS, and then patch the kernel.
The cause of the problem was somewhat careless coding on the part of Solaris kernel developers, combined with Intel’s ever-changing MSR (Model-Specific Register) implementation. Solaris 2.6 was the first to add support for Intel MCA, or Machine Check Architecture.
Intel’s MCA, introduced in the Pentium Pro (P6 microarchitecture), was an attempt to give an OS a chance to do something about hardware errors which were serious but not necessarily immediately fatal: parity errors, ECC failures, bus errors, cache problems. The CPU would generate a Machine Check Exception (MCE), a very high priority exception that the OS could handle to at least record the event, even if it wasn’t safe to continue.
MCA itself was an extension of the earlier MCE (Machine Check Exception) introduced in the Intel Pentium processor. MCA was essentially a more advanced superset of MCE.
As often happens, such features can easily cause more problems than they solve. Solaris 2.6/7/8 contained code to set up MCEs on the P6 family of processors. If the CPU reported MCA and MCE feature bits in CPUID, Solaris would run a setup_mca() routine early in the kernel start-up sequence (or during processor initialization for secondary processors).
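The CPUID condition can be illustrated with a minimal sketch. The bit positions (MCE is bit 7 and MCA is bit 14 of EDX in CPUID leaf 1) come from Intel's documentation; the function name is hypothetical and only stands in for whatever check the Solaris kernel actually performs.

```python
# CPUID leaf 1, EDX feature bits as documented by Intel.
CPUID_EDX_MCE = 1 << 7    # Machine Check Exception
CPUID_EDX_MCA = 1 << 14   # Machine Check Architecture


def would_run_setup_mca(edx: int) -> bool:
    """Hypothetical stand-in for the kernel's check: both bits must be set."""
    required = CPUID_EDX_MCE | CPUID_EDX_MCA
    return (edx & required) == required


if __name__ == "__main__":
    # Example EDX value with both bits set, then with the MCA bit cleared.
    print(would_run_setup_mca(0xBFEBFBFF))
    print(would_run_setup_mca(0xBFEBFBFF & ~CPUID_EDX_MCA))
```

Because both bits are required, hiding either one from the OS is enough to keep the routine from running.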
The routine worked on P6 family CPUs as designed, but broke on the Pentium 4 (and later Intel CPUs) because Intel slightly changed the layout of the MCA MSRs. The code in the Solaris kernel was supposed to take newer CPUs into account, but due to a coding error it didn’t. On the Pentium 4, it would attempt an invalid MSR write and cause a #GP fault, which would panic the system.
The problem was of course fixed in updated Solaris releases. Solaris 8 Update 5 (officially designated as Solaris 8 7/01) was able to boot on a Pentium 4, as were later Solaris 8 updates. For earlier Solaris 8 releases, patch 108529-08 corrected the problem, but installing it on a Pentium 4 system of course required kadb trickery as described below.
Sun’s bug numbers for the problem were 4408508, “setup_mca() has extra, faulty indirection; causes panic” and 4414557, “setup_mca: MSR definitions incorrect for Pentium 4, can’t boot”.
It’s not currently known whether any official patches were available for Solaris 2.6 or 7. Again, kadb patching worked on those releases.
This bug falls into the “no amount of testing would have caught that” category. There was simply no problem on the CPUs available at the time Solaris 2.6/7/8 were released, and only the newer Pentium 4 processor exposed the bug.
Note that the exact behavior depends on the specific CPU model and Solaris version. For example, Solaris 2.6 crashes on a Pentium 4 M while Solaris 8 FCS does not. On the other hand, Solaris 2.6 does not crash on a Core i7 (well, not because of MCE MSRs) while Solaris 8 FCS does. In both cases, the crash is caused by the setup_mca() routine and the resolution is the same.
What to do if Solaris 8 before U5 needs to be installed on a Pentium 4 or later system, or an older Solaris version for which no patch is available needs to be moved to newer hardware?
Fortunately, the Solaris kadb debugger makes it possible to patch the kernel and avoid the crash, either on an installed system or on the installation media. It is thus possible to install the unpatched OS onto an affected system and patch it after installation. The workaround is as follows:
- At the boot prompt, enter b kadb -d. This will load kadb and break into the debugger (the -d option) before the kernel starts executing.
- At the kadb prompt, enter setup_mca/w c3. This will patch a RET instruction at the beginning of the setup_mca routine and prevent the buggy function from running.
- Enter :c to continue execution and boot/install the OS.
These steps need to be performed on every boot until the OS is patched. A sample invocation from Solaris 8 FCS is shown here:
Once the system is booted, it’s possible to either install an official Solaris patch or apply the workaround permanently to the installed kernel. To permanently patch the kernel, run the following at the command prompt (with superuser privileges):
echo 'setup_mca?w c3' | adb -w /platform/i86pc/kernel/unix
This is the equivalent of the kadb runtime patch (adb is the older userland sibling of kadb), except it modifies the installed OS kernel file on disk.
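At the byte level, the patch is tiny. The sketch below mimics it on a made-up buffer; the original bytes are a hypothetical function prologue, not the real contents of setup_mca, and patch_word is an illustrative helper. It assumes the documented adb behavior that the w command stores a 2-byte value, so c3 lands in memory as C3 00 on a little-endian x86 system, and only the first byte (the near RET opcode) is ever executed.

```python
import struct

RET_OPCODE = 0xC3  # x86 near RET


def patch_word(image: bytearray, offset: int, value: int) -> None:
    """Mimic the adb/kadb 'w' command: store a 2-byte little-endian value."""
    image[offset:offset + 2] = struct.pack("<H", value)


# Hypothetical bytes standing in for the start of setup_mca:
# push ebp; mov ebp, esp; sub esp, 0x10
code = bytearray.fromhex("558bec83ec10")

patch_word(code, 0, RET_OPCODE)

# The routine now returns immediately; the bytes after the RET are dead code.
print(code.hex())
```

The same write applies whether the target is the running kernel (kadb) or the kernel file on disk (adb -w); only the destination differs.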
The problem may be visible in virtualized systems if they expose enough of the machine check exception MSRs—and if Solaris would crash were it to run on the host system directly; for example on AMD Opteron systems there is no problem.
VirtualBox 4.3 is one such hypervisor. On the one hand, it’s nice that such a problem can be examined in a VM… on the other hand it’s a bit inconvenient that OS bugs are exposed.
The guest OS can of course be patched as above. The alternative is not exposing the MCA/MCE CPUID bits; since the guest OS will never see any machine checks, it doesn’t need to be ready to receive them. Clearing either the MCA or MCE bit should do the trick for Solaris; at least clearing only the MCA bit is known to work.
The MCA bit is bit 14 in register EDX in CPUID leaf 1. In VirtualBox, one might for example run VBoxManage list hostcpuids to query the host’s CPUID information, clear bit 14 in the last doubleword (EDX) of leaf 1 (that’s the second leaf), and tweak the VM. Supposing the host’s EDX value is bfebfbff, the guest needs to see bfebbbff:
VBoxManage modifyvm MySolarisVM --cpuidset 1 000206a7 06100800 1fbae3ff bfebbbff
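The arithmetic is easy to double-check with a short sketch. The leaf 1 register values are the ones from the example above and will differ on other hosts; MySolarisVM is the example VM name, and clear_bit is just an illustrative helper.

```python
MCA_BIT = 14  # MCA feature flag in CPUID leaf 1, EDX


def clear_bit(value: int, bit: int) -> int:
    """Return a 32-bit value with the given bit cleared."""
    return value & ~(1 << bit) & 0xFFFFFFFF


# Host CPUID leaf 1 (EAX, EBX, ECX, EDX), taken from the example above.
eax, ebx, ecx, edx = 0x000206A7, 0x06100800, 0x1FBAE3FF, 0xBFEBFBFF

guest_edx = clear_bit(edx, MCA_BIT)

# Assemble the VBoxManage invocation for the example VM.
registers = " ".join(f"{reg:08x}" for reg in (eax, ebx, ecx, guest_edx))
command = f"VBoxManage modifyvm MySolarisVM --cpuidset 1 {registers}"
print(command)
```

Only the fifth hex digit from the right changes (f becomes b), since bit 14 has the value 0x4000.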
Et voilà, unpatched Solaris 8 boots again: