In July 1990, Microsoft released a specification for Virtual DMA Services, or VDS. This happened soon after the release of Windows 3.0, one of the first (though not the first) providers of VDS. The VDS specification was designed for writers of real-mode driver code for devices which used DMA, especially bus-master DMA.
Let’s start with some background information explaining why VDS was necessary and unavoidable.
In the days of PCs, XTs, and ATs, life was simple. In real mode, there was a given, immutable relationship between CPU addresses (16-bit segment plus 16-bit offset) and hardware-visible (20-bit or 24-bit) physical addresses. Any address could be trivially converted between a segment:offset format and a physical address.
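The conversion really is that simple. A minimal sketch follows; the function names are made up for illustration, and the 20-bit wrap models a machine with only 20 address lines:

```python
def seg_off_to_phys(segment: int, offset: int, address_bits: int = 20) -> int:
    """Convert a real-mode segment:offset pair to a physical address."""
    linear = (segment << 4) + offset
    # Addresses wrap around at the width of the address bus.
    return linear & ((1 << address_bits) - 1)

def phys_to_seg_off(phys: int) -> tuple:
    """Return one normalized segment:offset pair for an address below 1 MB."""
    return phys >> 4, phys & 0xF

print(hex(seg_off_to_phys(0xD000, 0x1234)))   # D000:1234
print(phys_to_seg_off(0xD1234))
```

Many segment:offset pairs map to the same physical address, which is why the reverse direction has to pick one form; the sketch picks the one with the smallest offset.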
When the 386 came along, things got complicated. When paging was enabled, the CPU’s memory-management unit (MMU) inserted a layer of translation between “virtual” and physical addresses. Because paging could be (and often was) enabled in environments that ran existing real-mode code in the 386’s virtual-8086 (V86) mode, writers of V86-mode control software had to contend with a nasty problem related to DMA, caused by the fact that any existing real-mode driver software (which notably included the system’s BIOS) had no idea about paging.
The first and most obvious “problem child” was the floppy. The PC’s floppy drive subsystem uses DMA; when the BIOS programs the DMA controller, it uses the real-mode segmented 16:16 address to calculate the corresponding physical address, and uses that to program the DMA page register and the 8237 DMA controller itself.
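As a rough illustration of that last step, the sketch below splits a physical address into the values destined for the page register and the 8237 on an AT-class machine, for an 8-bit channel such as the floppy's. The function name is invented for this example, and real code would write these values to I/O ports rather than return them.

```python
def dma_8bit_registers(phys: int, length: int):
    """Split a physical address for an 8-bit 8237 channel.

    Returns (page register value, 16-bit start address, count register value).
    The 8237 count register holds the transfer length minus one.
    """
    page = (phys >> 16) & 0xFF      # address bits 16-23 go to the page register
    address = phys & 0xFFFF         # address bits 0-15 go to the 8237 itself
    return page, address, length - 1

print([hex(v) for v in dma_8bit_registers(0xD1234, 512)])
```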
That works perfectly well… until software tries to perform any floppy reads or writes to or from paged memory. In the simplest case of an EMM386 style memory manager (no multiple virtual DOS machines), the problem strikes when floppy reads and writes target UMBs or the EMS page frame. In both cases, DOS/BIOS would usually work with a segmented address somewhere between 640K and 1M, but the 386’s paging unit translates this address to a location somewhere in extended memory, above 1 MB.
The floppy driver code in the BIOS does not and cannot have any idea about this, and sets up the transfer to an address between 640K and 1M, where there often isn’t any memory at all. Floppy reads and writes are not going to work.
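The mismatch is easy to model. In the sketch below, the page table contents and both function names are hypothetical; the point is only that the BIOS and the MMU arrive at different physical addresses for the same segment:offset.

```python
PAGE_SIZE = 4096

# Hypothetical page table: linear page number -> physical page number.
# Linear page 0xD0 (address 0xD0000, a UMB) is backed by extended memory.
page_table = {0xD0: 0x1A3}

def bios_view(segment, offset):
    """Address the BIOS would program into the DMA controller."""
    return (segment << 4) + offset

def mmu_view(linear):
    """Address the CPU actually reaches; identity-mapped unless remapped."""
    page, within = divmod(linear, PAGE_SIZE)
    return page_table.get(page, page) * PAGE_SIZE + within

linear = bios_view(0xD000, 0x0200)
print(hex(linear), hex(mmu_view(linear)))
```

The DMA controller sits outside the CPU's paging unit, so it would use the first address while the program reads its data from the second.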
DMA Controller Virtualization
For floppy access, V86 mode control software (EMM386, DesqView, Windows/386, etc.) took advantage of the ability to intercept port I/O. The V86 control software intercepts some or all accesses to the DMA controller. When it detects an attempt (by the DMA controller) to access paged memory where the real-mode address does not directly correspond to a physical address, the control software needs to do extra work. In some cases, it is possible to simply change the address programmed into the DMA controller.
In other cases it’s not. The paged memory may not be physically contiguous. That is, real-mode software might be working with a 16 KB buffer, but the buffer could be stored in four 4K pages that aren’t located next to each other in physical memory.
That’s something the PC DMA controller simply can’t deal with; it can only transfer to or from contiguous physical memory. And there are other potential problems. The memory could be above 16 MB, not addressable by the PC/AT DMA controller at all. Or it might be contiguous but straddle a 64K boundary, which is another thing the standard PC DMA controller can’t handle.
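The three checks can be sketched as follows. This is a simplified model, not code from any actual memory manager: the page table is a plain dictionary, the names are invented, and the 64K limit shown applies to 8-bit channels.

```python
PAGE_SIZE = 4096
DMA_LIMIT = 16 * 1024 * 1024   # 24-bit addressing of the PC/AT DMA controller
DMA_BOUNDARY = 64 * 1024       # an 8-bit channel cannot cross a 64K boundary

def physical_pages(linear, length, page_table):
    """Physical page numbers backing [linear, linear + length)."""
    first = linear // PAGE_SIZE
    last = (linear + length - 1) // PAGE_SIZE
    return [page_table.get(p, p) for p in range(first, last + 1)]

def needs_bounce_buffer(linear, length, page_table):
    """Return a reason if the 8237 cannot reach the buffer directly, else None."""
    pages = physical_pages(linear, length, page_table)
    if any(b != a + 1 for a, b in zip(pages, pages[1:])):
        return "not contiguous"
    start = pages[0] * PAGE_SIZE + linear % PAGE_SIZE
    end = start + length - 1
    if end >= DMA_LIMIT:
        return "above 16 MB"
    if start // DMA_BOUNDARY != end // DMA_BOUNDARY:
        return "crosses 64K boundary"
    return None

scattered = {0xD0: 0x200, 0xD1: 0x305, 0xD2: 0x201, 0xD3: 0x202}
print(needs_bounce_buffer(0xD0000, 0x4000, scattered))
```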
In such cases, the V86 control software must allocate a separate, contiguous buffer in physical memory that is fully addressable by the PC DMA controller. DMA reads must be directed into the buffer, and subsequently copied to the “real” memory which was the target of the read. For writes, memory contents must be first copied to the DMA buffer and then written to the device.
All this can be done more or less transparently by the V86 control software, at least for standard devices like the floppy.
DMA Bus Mastering
The V86 control software is helpless in the face of bus-mastering storage or network controllers. There is no standardized hardware interface, and often no I/O ports to intercept, either. Bus-mastering hardware does not use the PC DMA controller; it has its own built-in DMA controller.
While bus mastering became very common with EISA and PCI, it had been around since the PC/AT, and numerous ISA based bus-mastering controllers did exist.
This became a significant problem circa 1989. Not only were there several existing bus-mastering ISA SCSI HBAs on the market available as options (notably the Adaptec 1540 series), but major OEMs including IBM and Compaq were starting to ship high-end machines with a bus-mastering SCSI HBA (often MCA or EISA) as a standard option.
The V86 control software had no mechanism to intercept and “fix up” bus-mastering DMA transfers. Storage controllers were especially critical, because chances were high that without some kind of intervention, software like Windows/386 or Windows 3.0 in 386 Enhanced mode wouldn’t even load, let alone work.
The first workaround was double buffering: some piece of software, often a disk cache, would allocate a buffer in conventional memory (below 640K), and all disk accesses were funneled through that buffer.
It was far from ideal. Double buffering reduced the performance of expensive storage controllers that were supposed to be the best and fastest, and it ate precious conventional memory.
The Windows/386 Solution
Microsoft’s Windows/386 had to contend with all these problems. The optimal Win/386 solution was a native virtual driver, or VxD, which would interface with the hardware.
Due to lack of surviving documentation, it’s difficult to say exactly what services Windows/386 version 2.x offered. But we know exactly what Windows 3.0 offered when operating in 386 Enhanced mode.
Windows 3.0 came with the Virtual DMA Device aka VDMAD. This VxD virtualizes the 8237 DMA controller, but also offers several services intended to be used by drivers of bus-mastering DMA controllers.
For the worst case scenario which requires double buffering, VDMAD has a contiguous DMA buffer; this buffer can be requested using the VDMAD_Request_Buffer API and returned with VDMAD_Release_Buffer. While the VDMAD API could handle multiple buffers, Windows 3.x in reality only had one buffer. The buffer is in memory that is not necessarily directly addressable by the callers. The VDMAD_Copy_To_Buffer and VDMAD_Copy_From_Buffer APIs take care of this.
In some cases, double buffering is not needed. The VDMAD_Lock_DMA_Region (and the corresponding VDMAD_Unlock_DMA_Region) API can be used if the target memory is contiguous and accessible by the DMA controller. The OS will lock the memory, which means the physical underlying memory can’t be moved or paged out until it’s unlocked again. This is obviously necessary in a multi-tasking OS, because the target memory must remain in place until a DMA transfer is completed.
In the ideal scenario, a bus-mastering DMA controller supports scatter-gather. That is, the device itself can accept a list of memory descriptors, each with a physical memory address and corresponding length. Thus a buffer can be “scattered” in physical memory and “gathered” by the controller into a single entity. DMA controllers with scatter-gather are ideally suited for operating systems using paging. With scatter-gather, there is no need for double-buffering or any other workarounds.
The VDMAD_Scatter_Lock API takes the address of a memory buffer, locks its pages in memory, and fills in an “Extended DMA Descriptor Structure” (Extended DDS, or EDDS) with a list of physical addresses and lengths. The list from the EDDS is then supplied to the bus-mastering hardware. The VDMAD_Scatter_Unlock API unlocks the buffer once the DMA transfer is completed.
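Conceptually, building such a list means walking the buffer page by page and recording where each piece lives. The sketch below illustrates the idea only: it is not the VDMAD implementation, it models no locking, and the page table and function name are invented.

```python
PAGE_SIZE = 4096

def scatter_list(linear, length, page_table):
    """Return [(physical_address, byte_count), ...] covering the buffer.

    Physically adjacent pieces are merged into a single region.
    """
    regions = []
    while length > 0:
        page, within = divmod(linear, PAGE_SIZE)
        phys = page_table.get(page, page) * PAGE_SIZE + within
        chunk = min(PAGE_SIZE - within, length)   # stay within the current page
        if regions and regions[-1][0] + regions[-1][1] == phys:
            regions[-1] = (regions[-1][0], regions[-1][1] + chunk)
        else:
            regions.append((phys, chunk))
        linear += chunk
        length -= chunk
    return regions

table = {0xD0: 0x200, 0xD1: 0x201, 0xD2: 0x305, 0xD3: 0x150}
for phys, count in scatter_list(0xD0800, 0x3000, table):
    print(hex(phys), hex(count))
```

In this example the first two pages happen to be physically adjacent, so a 12 KB buffer touching four pages ends up as three regions.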
When it is available, using scatter/gather does not require any additional buffers and avoids extraneous copying. It takes full advantage of bus-mastering hardware. All modern DMA controllers (storage, networking, USB, audio, etc.) use scatter-gather, and all modern operating systems offer similar functionality to lock a memory region and return a list of corresponding physical addresses and lengths.
The VDMAD VxD also offers services to disable or re-enable default translation for standard 8237 DMA channels, and a couple of other minor services.
VDS, or Virtual DMA Services
Why the long detour into the details of a Windows VxD? Because in Windows 3.x, VDS is nothing more than a relatively thin wrapper around the VDMAD APIs. In fact VDS is implemented by the VDMAD VxD (the VDMAD source code for Windows 3.1 is in the Windows 3.1 DDK; unfortunately the Windows 3.0 DDK has not yet been recovered).
VDS offers the following major services:
- Lock and unlock a DMA region
- Scatter/gather lock and unlock a region
- Request and release a DMA buffer
- Copy into and out of a DMA buffer
- Disable and enable DMA translation
These services obviously rather directly correspond to VDMAD APIs. VDS provides a small amount of added value though.
For example, the API to lock a DMA region can optionally rearrange memory pages to make the buffer physically contiguous, if it wasn’t already (needless to say, this may fail, and many VDS providers do not even implement this functionality). The API can likewise allocate a separate DMA buffer, optionally copy to it when locking, or optionally copy from the buffer when unlocking.
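For orientation, here is how the services line up with function numbers. The table reflects the published VDS specification as best recalled here (INT 4Bh with AH=81h and the service number in AL), so treat the exact values as something to verify against the specification. The helper name is invented.

```python
VDS_INTERRUPT = 0x4B
VDS_AH = 0x81

# AL value -> service, as recalled from the VDS specification (verify before use).
VDS_FUNCTIONS = {
    0x02: "Get Version",
    0x03: "Lock DMA Region",
    0x04: "Unlock DMA Region",
    0x05: "Scatter/Gather Lock Region",
    0x06: "Scatter/Gather Unlock Region",
    0x07: "Request DMA Buffer",
    0x08: "Release DMA Buffer",
    0x09: "Copy Into DMA Buffer",
    0x0A: "Copy Out Of DMA Buffer",
    0x0B: "Disable DMA Translation",
    0x0C: "Enable DMA Translation",
}

def vds_ax(function: int) -> int:
    """Value a real-mode caller would load into AX before INT 4Bh."""
    return (VDS_AH << 8) | function

for al, name in VDS_FUNCTIONS.items():
    print(f"AX={vds_ax(al):04X}h  {name}")
```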
The VDS specification offers a list of possible DMA transfer scenarios, arranged from best to worst:
- Use scatter/gather. Of course, hardware must support this, and not all hardware does.
- Break up DMA requests into small pieces so that double-buffering is not required. This technique will help a lot, but won’t work in all cases (e.g. when the target buffer is not contiguous).
- Break up transfers and use the OS-provided buffer, which is at least 16K in size according to the VDS specification. This involves double-buffering and splitting larger transfers, hurting performance the most.
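The last and slowest option can be sketched as a loop that moves the data in buffer-sized pieces, with an extra copy per piece. This is a toy model: `device_read` is a hypothetical callback standing in for a DMA transfer into the buffer, not a VDS call.

```python
BUFFER_SIZE = 16 * 1024  # minimum buffer size mentioned above

def read_via_bounce_buffer(device_read, target, total):
    """Read `total` bytes into `target` through a fixed-size intermediate buffer.

    device_read(offset, count) returns `count` bytes, as if DMA had just
    filled the intermediate buffer. Returns the number of transfers needed.
    """
    done = 0
    transfers = 0
    while done < total:
        count = min(BUFFER_SIZE, total - done)
        bounce = device_read(done, count)      # DMA into the buffer
        target[done:done + count] = bounce     # extra copy to the real target
        done += count
        transfers += 1
    return transfers

disk = bytes(range(256)) * 160                 # 40 KB of sample data
dest = bytearray(len(disk))
n = read_via_bounce_buffer(lambda off, cnt: disk[off:off + cnt], dest, len(disk))
print(n, bytes(dest) == disk)
```

A 40 KB read becomes three separate transfers plus three memory copies, which is where the performance cost comes from.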
From the above it’s apparent that Windows 3.0 was likely the canonical VDS implementation. But it was far from the only one, and it wasn’t even the first one released. More or less any software using V86 mode and paging had to deal with the problem one way or another.
An instructive list can be found for example in the Adaptec ASW-1410 documentation, i.e. DOS drivers for the AHA-154x SCSI HBAs. The ASPI4DOS.SYS driver had the ability to provide double-buffering, with all the downsides. This was not required by newer software which provided VDS. The list included the following:
- Windows 3.0 (only relevant in 386 Enhanced mode)
- DOS 5.0 EMM386
- QEMM 5.0
- 386MAX 4.08
- Generally, protected mode software with VDS support
A similar list was offered by IBM, additionally including 386/VM.
It appears that Quarterdeck’s QEMM 5.0 may have been the first publicly available VDS implementation in January 1990. Note that QEMM 5.0 was released before Windows 3.0.
VDS was also implemented by OS/2. It wasn’t present in the initial OS/2 2.0 release but was added in OS/2 2.1.
The VDS implementation in Windows 3.0 was rather buggy, and it’s obvious that at least some of the functionality was completely untested.
For example, the functions to copy to/from a DMA buffer (VDS functions 09h/0Ah) have a coding error which causes buffer size validation to spuriously fail more often than not; that is, the functions fail because they incorrectly determine that the destination buffer is too small when it’s really not. Additionally, the function to release a DMA buffer (VDS function 08h) fails to do so unless the flag to copy out of the buffer is also set.
There was of course a bit of a chicken and egg problem. VDS was to be used with real mode device drivers, none of which were supplied by Microsoft. It is likely that some of the VDS functionality in Windows 3.0 was tested with real devices prior to the release, but certainly not all of it.
In the Adaptec ASPI4DOS.SYS case, the driver utilizes VDS and takes over the INT 13h BIOS disk service for drives controlled by the HBA’s BIOS.
Newer Adaptec HBAs, such as the AHA-154xC and later, come with a BIOS which itself uses VDS. This poses an interesting issue because the BIOS must be prepared for VDS to come and go. That is not as unlikely as it might sound; for example on a system with just HIMEM.SYS loaded, there will be no VDS. If Windows 3.x in 386 Enhanced mode is started, VDS will be present and must be used, but when Windows terminates, VDS will be gone again.
This is not much of a problem for disk drivers; VDS presence can be checked before each disk transfer and VDS will be either used or not. It’s trickier for network drivers though. If a network driver is loaded when no VDS is present, it may set up receive buffers and program the hardware with their physical addresses, and the hardware may then write into those buffers at any later time, including after a VDS provider has started. For that reason, the VDS specification strongly suggests that VDS implementations should leave existing memory (e.g. conventional memory) in place, so that already-loaded drivers continue to work.
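The per-transfer check can be as cheap as testing a single bit. The VDS specification, as commonly summarized, signals presence via bit 5 of the byte at 0040:007B in the BIOS data area; treat that location as an assumption to be checked against the specification itself. The sketch models the BIOS data area as a byte array.

```python
VDS_FLAG_OFFSET = 0x7B   # assumed: byte at 0040:007B in the BIOS data area
VDS_PRESENT_BIT = 0x20   # assumed: bit 5 set while a VDS provider is active

def vds_present(bda) -> bool:
    """Check the presence flag; meant to be called before every transfer."""
    return bool(bda[VDS_FLAG_OFFSET] & VDS_PRESENT_BIT)

bda = bytearray(256)             # stand-in for the 256-byte BIOS data area
print(vds_present(bda))          # plain DOS, e.g. only HIMEM.SYS loaded
bda[VDS_FLAG_OFFSET] |= VDS_PRESENT_BIT
print(vds_present(bda))          # a VDS provider has started
```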
Not Just SCSI
Documentation for old software (such as Windows 3.0) often talks about “busmastering SCSI controllers” as if they were the only class of devices affected. That was never really true, but bus-mastering SCSI HBAs were by far the most widespread class of hardware affected by the problems with paging and DMA not playing along.
By 1990, the Adaptec 154x HBAs were already well established (the AHA-1540 had been available since about 1987), and Adaptec was not the only vendor of bus-mastering SCSI HBAs.
There were also bus-mastering Ethernet adapters that started appearing in 1989-1990, such as ones based on the AMD LANCE or Intel 82586 controllers. Later PCI Ethernet adapters used bus mastering almost exclusively. Their network drivers for DOS accordingly utilized VDS.
Microsoft released the initial VDS documentation in July 1990 in a self-extracting archive aptly named VDS.EXE (as documented in KB article Q63937). After the release of Windows 3.1, Microsoft published an updated VDS specification in October 1992, cunningly disguised in a file called PW0519T.TXT; said file was also re-published as KB article Q93469.
IBM also published VDS documentation in the PS/2 BIOS Technical Reference, without ever referring to ‘VDS’. The IBM documentation is functionally identical to Microsoft’s, although it was clearly written independently. It is likely that IBM was an early VDS user in PS/2 machines equipped with bus-mastering SCSI controllers.
Original VDS documentation is helpfully archived here, among other places.
VDS was a hidden workhorse making bus-mastering DMA devices transparently work in DOS environments. It was driven by necessity, solving a problem that was initially obscure but by about 1989 had become increasingly widespread. The interface was very similar to the API of the Windows 3.0 VDMAD VxD, but VDS was implemented by more or less every 386 memory manager. It was used by loadable DOS drivers but also by the ROM BIOS of post-1990 adapters.