While working on an unrelated problem, I noticed strange behavior in one of my OS/2 VMs running OS/2 Warp 4.52. To cut a long story short, if an unhandled floating-point exception occurred in a DOS window (VDM, or Virtual DOS Machine) while executing real-mode code, the DOS box would crash because it tried to execute an invalid instruction. The IRQ13 (INT 75h) handler provided by the BIOS would run and then execute an INT 02h instruction (for compatibility with old PCs). But interrupt vector 2 pointed to a place in the middle of the VDM’s conventional memory that wasn’t even allocated. It should have pointed into the BIOS, where only harmless code would execute.
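For readers who want to check where a vector actually points, here is a minimal sketch of decoding a real-mode interrupt vector. It assumes you already have the first 1 KB of memory as a raw byte dump; the helper names `read_vector` and `linear_address` are made up for this example, and the F000:E2C3 value planted in vector 2 is only an illustrative BIOS-style address, not a value taken from the VM described here.

```python
import struct

def read_vector(ivt: bytes, n: int) -> tuple[int, int]:
    """Return (segment, offset) stored in interrupt vector n of a raw IVT image."""
    # Each real-mode vector is 4 bytes: a 16-bit offset followed by a 16-bit segment.
    offset, segment = struct.unpack_from("<HH", ivt, n * 4)
    return segment, offset

def linear_address(segment: int, offset: int) -> int:
    """Convert a real-mode segment:offset pair to a linear address."""
    return (segment << 4) + offset

# Illustrative IVT image (assumption): all zeros except vector 2,
# which is pointed at a BIOS-style address in segment F000h.
ivt = bytearray(256 * 4)
struct.pack_into("<HH", ivt, 2 * 4, 0xE2C3, 0xF000)

seg, off = read_vector(ivt, 2)
print(f"INT 02h -> {seg:04X}:{off:04X} (linear {linear_address(seg, off):05X})")
```

Vector 2 lives at physical address 8 (2 × 4), which is why a dump of the first few dozen bytes of memory is enough to see the problem.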
I quickly realized that only one of several very similarly configured VMs did this. At first I suspected a problem with the way the DOS box in the troublesome VM was set up, but that wasn’t the case. The OS version wasn’t it either. What’s more, the VDM was simply reflecting the contents of physical memory (the former real-mode IVT, or Interrupt Vector Table, at physical address zero)… and the IVT was modified very early in the boot process, even before the OS/2 boot logo or boot menu appeared.
Finally, realization dawned: the OS/2 kernel debugger was doing this, and that’s why most of my VMs didn’t have the problem. But why would this happen at all?
The behavior likely exists in every 32-bit OS/2 version; it was verified to be present in OS/2 2.11 (1993), OS/2 Warp 3 (1994), and OS/2 Warp 4.52 refresh (2002). Now let’s make it clear that this is a bug that is very difficult to trigger. The OS/2 kernel debugger overwrites the first six interrupt vectors. For the most part, the default BIOS handlers do nothing, which means that software which might generate these interrupts will install its own handlers and avoid this issue entirely.
The affected interrupts include division by zero, single step, breakpoint, and overflow. The NMI (vector 2) is the odd one out because the BIOS actually does contain a handler for it. On the other hand, an OS/2 VDM won’t get any real NMIs… except that the BIOS IRQ13/INT 75h handler is supposed to invoke the NMI service routine in software, via INT 02h.
The problem shows a few things about the OS/2 MVDM (Multiple Virtual DOS Machine) operation. DOS boxes in OS/2 partially get to execute the system’s actual BIOS, and the host system’s real IVT is used to initialize the IVT in the DOS boxes. The VDM environment isn’t entirely divorced from the underlying hardware/firmware, which has both advantages and disadvantages.
But back to the question of why this happens at all. In the days of OS/2 1.x, the OS would frequently switch between real and protected mode. The kernel debugger had both protected- and real-mode parts, and it took over several real-mode interrupts. While the DOS box was executing, the kernel debugger was in fact present the whole time.
Starting with OS/2 2.0, the situation is different. The kernel debugger still initializes a real-mode part and installs handlers for the first six interrupt vectors. This is useful while the real-mode portion of the OS startup sequence is executing. But soon enough, the OS switches to protected mode and stays there; any real-mode code is then executed within a V86 task.
By now, the cause of the problem is probably obvious: The OS/2 kernel debugger takes over the first six real-mode interrupts, but the vectors are never restored. When a VDM starts, its real-mode IVT will contain several vectors pointing to where the real-mode kernel debugger used to be back when the OS was still in real mode, but those vectors are no longer valid. If the interrupt handlers are, against all odds, somehow invoked, the VDM will almost certainly crash very quickly.
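One rough way to spot such leftovers in an IVT dump is to flag any of the first six vectors that do not point into the system BIOS. The sketch below is a heuristic only: it assumes the BIOS occupies linear addresses F0000h through FFFFFh, the function name `stale_candidates` is hypothetical, the name given to vector 5 is my addition (the text above does not name it), and the addresses in the sample IVT are invented for illustration. A vector that software has legitimately hooked would be flagged too.

```python
import struct

# Assumption: the system BIOS occupies the top 64 KB of the first megabyte.
BIOS_START = 0xF0000
BIOS_END = 0xFFFFF

VECTOR_NAMES = {
    0: "divide error",
    1: "single step",
    2: "NMI",
    3: "breakpoint",
    4: "overflow",
    5: "BOUND / print screen",  # not named in the article; added for completeness
}

def stale_candidates(ivt: bytes, count: int = 6) -> list[tuple[int, int, int]]:
    """Return (vector, segment, offset) for vectors that point outside the BIOS."""
    suspects = []
    for n in range(count):
        offset, segment = struct.unpack_from("<HH", ivt, n * 4)
        linear = (segment << 4) + offset
        if not BIOS_START <= linear <= BIOS_END:
            suspects.append((n, segment, offset))
    return suspects

# Made-up sample: every vector points into the BIOS except vector 2,
# which points into the middle of conventional memory.
sample_ivt = bytearray(256 * 4)
for n in range(6):
    struct.pack_into("<HH", sample_ivt, n * 4, 0xFF53, 0xF000)
struct.pack_into("<HH", sample_ivt, 2 * 4, 0x0120, 0x0A00)

for n, seg, off in stale_candidates(sample_ivt):
    print(f"INT {n:02X}h ({VECTOR_NAMES[n]}) -> {seg:04X}:{off:04X}, outside the BIOS")
```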
This appears to be a very old and very obscure bug. It only shows up on OS/2 systems with the kernel debugger installed, and that is a tiny minority. Even on those systems, it is quite improbable that the bug will be triggered; most likely that would happen only as a consequence of another bug, when interrupts that shouldn’t be invoked somehow are. It’s no wonder that the bug went undetected. It would presumably be easy to fix by restoring the original IVT contents, either before the system switches to protected mode for good or, at the very least, when setting up newly created VDMs.
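To illustrate the idea behind such a fix, here is a small simulation of saving and restoring the first six vectors around a hook. This is only a sketch of the concept on a byte array, not how the OS/2 kernel or its debugger is actually structured; the function names and all addresses are invented.

```python
import struct

VECTOR_SIZE = 4      # bytes per real-mode interrupt vector
HOOKED_VECTORS = 6   # the debugger takes over vectors 0 through 5

def save_vectors(ivt: bytes, count: int = HOOKED_VECTORS) -> bytes:
    """Copy the first `count` vectors so they can be put back later."""
    return bytes(ivt[:count * VECTOR_SIZE])

def restore_vectors(ivt: bytearray, saved: bytes) -> None:
    """Write previously saved vectors back to the start of the IVT."""
    ivt[:len(saved)] = saved

# Simulated IVT with BIOS-style vectors (illustrative values only).
ivt = bytearray(256 * VECTOR_SIZE)
for n in range(HOOKED_VECTORS):
    struct.pack_into("<HH", ivt, n * VECTOR_SIZE, 0xFF53, 0xF000)

saved = save_vectors(ivt)

# The "debugger" hooks the vectors with handlers in conventional memory.
for n in range(HOOKED_VECTORS):
    struct.pack_into("<HH", ivt, n * VECTOR_SIZE, 0x0100 + n * 8, 0x0A00)

# Before leaving real mode for good (or when building a new VDM's IVT),
# the original contents are put back.
restore_vectors(ivt, saved)
```

The same saved copy could serve either variant of the fix: restore it once before the final switch to protected mode, or use it to patch the IVT handed to each new VDM.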