It is well known that virtualization of the x86 architecture is an old idea. The Intel 386 processor (1985) introduced the “Virtual 8086” (V86) mode, enabling users to run real-mode operating systems as tasks within a 32-bit protected-mode operating system.
A more complete virtualization of the x86 architecture, one that includes 16-bit and 32-bit protected mode, is likewise relatively old. One of the better-known products which provided full x86 virtualization on x86 systems, VMware Workstation, dates back to 1999. Emulation of x86 systems on other architectures is even older; Virtual PC for the PowerPC Mac, for instance, appeared in 1997.
On x86 hosts, virtualization had to contend with numerous “holes” in the 32-bit x86 instruction set which made virtualization difficult and/or slow. Among the better-known problems are the POPF instruction, which may quietly fail to update the interrupt flag, and the SMSW instruction, which lets the guest operating system see the true state of control bits without allowing the hypervisor to trap the access. To overcome these and other issues, Intel designed the VT-x (also known as VMX) extension to the x86 architecture, and AMD developed its own AMD-V hardware virtualization support. Specifications for VT-x and AMD-V were only published in 2005 and 2006, respectively; it took several more years for x86/x64 CPUs with hardware virtualization support to become mainstream. Yet the idea of complete x86 hardware virtualization is much, much older, predating VT-x by more than 15 years!
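To make the first of those holes concrete, here is a minimal user-mode sketch (x86-64 Linux and GCC inline assembly assumed, using the 64-bit PUSHFQ/POPFQ forms) that attempts to clear the interrupt flag with POPF. At CPL 3 with IOPL below 3, the CPU silently ignores the change instead of faulting, so a trap-and-emulate hypervisor never gets a chance to intervene:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t before, modified, after;

    /* Read RFLAGS, then try to clear IF (bit 9) from user mode. */
    __asm__ volatile ("pushfq; popq %0" : "=r" (before));
    modified = before & ~(1ULL << 9);
    /* At CPL 3 with IOPL < 3, POPF silently ignores the IF change
     * instead of faulting, so a hypervisor cannot trap and emulate it. */
    __asm__ volatile ("pushq %0; popfq" : : "r" (modified) : "cc", "memory");
    __asm__ volatile ("pushfq; popq %0" : "=r" (after));

    printf("IF before: %llu, IF after attempted clear: %llu\n",
           (unsigned long long)((before >> 9) & 1),
           (unsigned long long)((after >> 9) & 1));
    return 0;
}
```

Run as an ordinary user program, it should print the same IF value (1) both times, with no fault the hypervisor could have caught.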
The advantages of virtualization were obvious to anyone familiar with IBM’s VM/370 system. Older operating systems and applications could be kept running on upgraded hardware, while new operating systems could be incrementally tested in a production environment. Best of all, multiple operating systems could run on a single host system at the same time.
With the 386, Intel finally had a processor architecture comparable to mainframe CPUs. Unfortunately, hardware support for virtualization was restricted to 8086 environments through the V86 mode noted above. Virtualizing a 32-bit protected-mode system was not practical.
One of the nastier issues was caused by segmentation and GDT/LDT (Global/Local Descriptor Table) usage. A hypervisor could not let a guest operating system manage its own descriptor tables (because the guest could then overwrite the hypervisor’s memory), yet the instructions which store descriptor register values could not be trapped. A guest OS could therefore read the true descriptor register values, quite possibly not the values it had written.
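The store side of the problem is easy to demonstrate. The sketch below (again x86-64 and GCC assumed) executes SGDT, one of the descriptor-table store instructions, from ring 3; on a 386-class CPU it reads back the real GDTR with no possibility of a trap. Only much later did the UMIP feature allow such instructions to be intercepted in user mode:

```c
#include <stdio.h>
#include <stdint.h>

/* Format of the memory operand written by SGDT in 64-bit mode. */
struct __attribute__((packed)) dtr {
    uint16_t limit;
    uint64_t base;
};

int main(void)
{
    struct dtr gdtr = { 0 };

    /* SGDT is not privileged on the 386: ring-3 code reads the real
     * GDTR, so a hypervisor has no way to present a virtualized value.
     * (On CPUs with UMIP enabled, the OS can intercept this instead,
     * which is exactly the escape hatch the 386 lacked.) */
    __asm__ volatile ("sgdt %0" : "=m" (gdtr));

    printf("GDT base %#llx, limit %#x\n",
           (unsigned long long)gdtr.base, (unsigned)gdtr.limit);
    return 0;
}
```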
These issues could be avoided by code scanning and patching, but at the cost of high complexity and a significant performance loss. The overhead would likely have made virtualization unattractive.
386 Hardware Virtualization
A solution to this problem was proposed in the May/June 1988 issue of Programmer’s Journal on page 46, in an article by Kevin Smith called simply “Virtualizing the 386”. Yes, that’s 1988—before the 486, before Windows 3.0, before even DOS 4.0.
The solution was in many ways similar to what Intel implemented in VT-x nearly 20 years later. At the same time it’s also much simpler, primarily because the 386 was a far simpler CPU than the Pentium 4 class processors which first supported VT-x.
Smith suggested “protected normal” and “protected VM” processor modes, much like the root and non-root VMX operation in VT-x. Rather than creating completely new data structures, Smith’s design simply extended the existing TSSs (Task State Segments) to store additional information.
A hypervisor would then be a “super-task” which could switch to the guest context through a far jump to a task gate (pointing to the guest’s TSS), a mechanism introduced with the 286. Certain events would cause the guest OS to switch back to the hypervisor task, again via a TSS switch; such events included control register accesses, external interrupts, and execution of the HLT instruction.
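For context, the existing 32-bit TSS already captures a complete execution context, including CR3, EFLAGS, all general-purpose and segment registers, and the LDT selector, which is what made it a plausible container for guest state. A rough C rendering of the standard layout follows; the trailing comment only marks where Smith’s proposed extensions would conceptually live, since the article’s exact fields are not reproduced here:

```c
#include <stdint.h>
#include <stdio.h>

/* Standard 386 32-bit Task State Segment (104 bytes).  Fields holding
 * selectors use only their low 16 bits; the high bits are reserved. */
struct __attribute__((packed)) tss386 {
    uint32_t prev_task_link;            /* back link to the previous TSS   */
    uint32_t esp0, ss0;                 /* ring 0 stack                    */
    uint32_t esp1, ss1;                 /* ring 1 stack                    */
    uint32_t esp2, ss2;                 /* ring 2 stack                    */
    uint32_t cr3;                       /* page directory base             */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;    /* segment registers               */
    uint32_t ldt_selector;
    uint16_t t_flag;                    /* T bit: debug trap on task switch*/
    uint16_t iomap_base;                /* offset of I/O permission bitmap */
    /* Smith's "protected VM" extensions would conceptually be appended
     * here: intercept controls and exit information such as the faulting
     * instruction length (the article's exact layout is not reproduced). */
};

int main(void)
{
    /* The architectural 386 TSS is 0x68 (104) bytes long. */
    printf("sizeof(struct tss386) = %zu bytes\n", sizeof(struct tss386));
    return 0;
}
```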
Paging would be handled strictly on the hypervisor side through “shadow paging”, a technique commonly used by modern hypervisors in the absence of nested paging in hardware.
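As a rough illustration of the technique, the following sketch (plain C, with a stand-in for the hypervisor’s guest-to-host frame mapping) rebuilds one shadow page table from a guest page table: the guest’s permission bits are preserved, but each guest frame number is replaced by the host frame that actually backs it, and it is the shadow table rather than the guest’s own that ends up in CR3:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_PRESENT   0x001u
#define PAGE_RW        0x002u
#define PAGE_USER      0x004u
#define PTES_PER_TABLE 1024

/* Stand-in for the hypervisor's guest-physical to host-physical map;
 * here the guest's RAM is simply assumed to sit at a fixed offset. */
static uint32_t gpa_to_hpa_pfn(uint32_t guest_pfn)
{
    return guest_pfn + 0x10000;   /* hypothetical 256 MB offset */
}

/* Rebuild one shadow page table from the guest's page table.  The MMU
 * only ever walks the shadow copy, so guest frame numbers are replaced
 * by the host frames that actually back them, while the guest's own
 * permission bits are preserved. */
static void sync_shadow_table(const uint32_t *guest_pt, uint32_t *shadow_pt)
{
    for (int i = 0; i < PTES_PER_TABLE; i++) {
        uint32_t gpte = guest_pt[i];

        if (!(gpte & PAGE_PRESENT)) {
            shadow_pt[i] = 0;     /* not mapped by the guest */
            continue;
        }
        shadow_pt[i] = (gpa_to_hpa_pfn(gpte >> 12) << 12)
                     | (gpte & (PAGE_PRESENT | PAGE_RW | PAGE_USER));
    }
}

int main(void)
{
    static uint32_t guest_pt[PTES_PER_TABLE], shadow_pt[PTES_PER_TABLE];

    /* Toy guest mapping: guest frame 0x123, writable, user-accessible. */
    guest_pt[5] = (0x123u << 12) | PAGE_PRESENT | PAGE_RW | PAGE_USER;
    sync_shadow_table(guest_pt, shadow_pt);
    printf("shadow PTE 5 = %#x\n", shadow_pt[5]);
    return 0;
}
```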
Many of Smith’s suggestions are eerily reminiscent of VT-x and AMD-V, such as storing the length of the current (faulting) instruction when switching to the hypervisor, or optional interception of software interrupts.
One additional feature which Intel and AMD did not implement is a TRA (Translate Real Address) instruction, which would translate a virtual address to a physical address using a page directory potentially different from the current one.
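What TRA would have done in hardware is straightforward to express in software. The toy program below (a small array stands in for physical memory) walks a 386-style two-level page table rooted at an arbitrary page directory and returns the physical address for a given virtual address, which is essentially the lookup a shadow-paging hypervisor performs all the time:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_PRESENT 0x001u

/* Stand-in for guest physical memory; a real hypervisor would access
 * the guest's memory through its own mappings. */
static uint8_t phys_mem[64 * 1024];

static uint32_t read_phys32(uint32_t pa)
{
    uint32_t v;
    memcpy(&v, &phys_mem[pa], sizeof v);
    return v;
}

static void write_phys32(uint32_t pa, uint32_t v)
{
    memcpy(&phys_mem[pa], &v, sizeof v);
}

/* Software equivalent of the proposed TRA instruction: walk a 386-style
 * two-level page table rooted at an arbitrary page directory (not
 * necessarily the one currently in CR3) and return the physical address
 * for a virtual address, or -1 if the translation is not present. */
static int64_t translate(uint32_t page_dir, uint32_t vaddr)
{
    uint32_t pde = read_phys32(page_dir + (vaddr >> 22) * 4);
    if (!(pde & PAGE_PRESENT))
        return -1;

    uint32_t pte = read_phys32((pde & 0xFFFFF000u) + ((vaddr >> 12) & 0x3FFu) * 4);
    if (!(pte & PAGE_PRESENT))
        return -1;

    return (int64_t)((pte & 0xFFFFF000u) | (vaddr & 0xFFFu));
}

int main(void)
{
    /* Build a toy mapping: virtual 0x00401000 -> physical 0x00005000. */
    uint32_t pgdir = 0x1000, pgtbl = 0x2000;
    write_phys32(pgdir + (0x00401000u >> 22) * 4, pgtbl | PAGE_PRESENT);
    write_phys32(pgtbl + ((0x00401000u >> 12) & 0x3FFu) * 4, 0x5000 | PAGE_PRESENT);

    printf("0x00401000 -> %#llx\n", (long long)translate(pgdir, 0x00401000u));
    return 0;
}
```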
It is difficult to guess what the computing landscape might look like today if Intel had implemented hardware virtualization back in 1989-1990. It might be very significantly different, or perhaps not.
Hardware virtualization would probably have had a significant impact on the OS wars of the 1990s, and one has to wonder if Microsoft would have been as successful in an environment where Windows could be easily virtualized and used alongside a different OS without the need to dual boot.
It is quite possible that hardware virtualization support would have been very beneficial to Intel. It is conceivable that Intel could have become the “owner” of the PC industry; Microsoft might have still become incredibly successful with Windows, but Intel could have replaced IBM as the steward of the PC hardware platform.
For reasons that may be lost to history, Intel did not implement hardware virtualization until much, much later. In the early 2000s, processors were fast enough and code analysis and dynamic translation technologies were advanced enough that virtualization of x86 on x86 became practical. Once the business importance of virtualization was blindingly obvious, Intel implemented a hardware virtualization technology which was in principle extremely similar to what Kevin Smith had suggested back in 1988.