The infamous A20 gate is well known and documented in hair-raising detail. What’s much less well documented is the real purpose of the A20 gate, that is, who actually needs the 8086 address wrap-around to be emulated in the first place.
There is precious little information on that topic. But as it turns out, all versions of DOS, including the OS/2 DOS box and NTVDM in Windows NT, implement a CP/M-like system call interface which cannot work without address wrap-around being emulated one way or another. In other words, any DOS application may in fact need the wrap-around without even being aware of the fact. The key is the so-called ‘CALL 5’ compatibility interface, invoked by a near call to offset 5 of the PSP (Program Segment Prefix). This interface was intended for COM programs, but will work as long as the CS register points to the PSP, regardless of the program type. So what’s at offset 5 of the PSP?
First let’s review the official documentation. The MS-DOS Programmer’s Reference for version 5 (from 1991) says that the field at offset 5 of the PSP “[c]ontains a long call to the MS-DOS function-request handler. This is provided for compatibility with earlier versions of MS-DOS.” That’s of course a bit of corporate doublespeak, since the desired compatibility is primarily with CP/M, which DOS was trying to approximate. There is no mention that the data at offset 6 might have any meaning. There’s also no mention of the fact that the interface is slightly different from INT 21h, with the function number being passed in CL rather than AH.
The earlier MS-DOS Programmer’s Reference editions for version 4 and 3.3 (both from 1988) mark the bytes at offset 5-9 of the PSP as reserved, without any explanation whatsoever as to their purpose.
However, Ray Duncan’s Advanced MS-DOS Programming (also published in 1988) not only mentions that offset 5 of the PSP contains a call to the DOS function dispatcher, but explains that this exists for compatibility with the CALL 5 system call interface of CP/M. That seems a bit schizophrenic since Advanced MS-DOS Programming was published by Microsoft Press just like the official DOS references.
What do unofficial books say? The otherwise excellent Undocumented DOS (2nd edition, 1993) explains the CALL 5 compatibility interface and mentions a “rather cryptically coded far jump to the dispatcher area of MS-DOS itself”. There’s no explanation why the address is cryptically coded, and no mention that the jump is rather indirect.
So far we have official admission that the CALL 5 compatibility interface exists, but no explanation what that has to do with address wrapping. For that, one must go much further back. The 86-DOS Programmer’s Manual from 1980, which Tim Paterson graciously made available, explains the details of the interface. Function number (for functions 36 decimal or less) is placed in CL and a call to location 5 is made. The manual also states: “This form is provided to simplify translation of 8080/Z80 programs into 8086 code, and is not recommended for new programs.” That was before PC DOS or MS-DOS even existed, yet the interface is still around several decades later…
On page 17 of the 86-DOS manual, there’s another part of the puzzle. Offset 5 in the PSP is, not surprisingly, documented as the alternate function request entry point. But more importantly, offset 6-7 is documented as: “Memory size. This is the number of bytes available in the program segment.” This is another CP/M compatibility feature, and the core of the problem. If offset 5 contains a call or jump, how can offset 6 contain some data item?
On CP/M, the word at offset 6 pointed to the top of program memory, and right there was the system call entry point. Offset 5 in the program segment contained a jump instruction and the word at offset 6 was simply the jump's target address, serving double duty as a data item.
On 86-DOS, that arrangement would have been extremely inconvenient. Because of the larger address space of the 8086 (the CP/M machines that 86-DOS was trying to be compatible with were limited to 64KB address space), storing system data past the program segment would mean fragmenting the available memory. 86-DOS, and hence PC DOS/MS-DOS, used a clever trick. The byte at offset 5 of the PSP contained a far call opcode (9Ah); the word at offset 6 of the PSP contained the appropriate value to indicate program segment size, and also the offset part of the far call. The word at offset 8, which served as the segment part of the far call, was crafted such that when combined with the offset, it would wrap around (a well understood feature of the 8086 CPU) and point to address 0:C0h, which contains interrupt vector 30h. That is why the address wrap-around is needed.
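To make the double duty of those bytes concrete, here is a minimal Python sketch that decodes the five bytes at PSP offset 5. The byte values are an assumed example for a program segment of FEF0h bytes, and `decode_call5` is a hypothetical helper name used only for illustration; it is not part of DOS or any library.

```python
import struct

FAR_CALL_OPCODE = 0x9A  # far CALL opcode stored at PSP offset 5


def decode_call5(psp: bytes):
    """Decode the far call at PSP offsets 5-9.

    Returns (offset, segment). The offset word doubles as the
    program segment size in bytes.
    """
    # One opcode byte, then two little-endian words: offset, segment.
    opcode, offset, segment = struct.unpack_from("<BHH", psp, 5)
    if opcode != FAR_CALL_OPCODE:
        raise ValueError(f"expected far call opcode 9Ah, got {opcode:02X}h")
    return offset, segment


# Assumed example PSP: only offsets 5-9 are filled in, the rest is zero.
psp = bytearray(256)
psp[5:10] = bytes([0x9A, 0xF0, 0xFE, 0x1D, 0xF0])

offset, segment = decode_call5(psp)
print(f"CALL {segment:04X}:{offset:04X}")
print(f"Program segment size: {offset} bytes")
```

The same word is read once as data (the size) and once as part of an instruction (the call offset), which is exactly the overlap the trick depends on.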
The typical code at offset 5 of the PSP is CALL F01D:FEF0, where FEF0h is the program segment size (65,264 bytes). Applying the usual 8086 segment arithmetic, the call points to linear address 1000C0h, which is C0h after the wrap-around. If the program segment were smaller, the segment portion of the far call would have to be modified to give the same end result.
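The arithmetic can be checked with a short Python sketch. The function names are illustrative only; the 20-bit mask models an 8086 (or a later CPU with the A20 line disabled), which is an assumption of this sketch rather than a description of any particular machine.

```python
ADDRESS_MASK_20BIT = 0xFFFFF  # an 8086 has only 20 address lines


def linear_address(segment: int, offset: int) -> int:
    """Real-mode address calculation: segment * 16 + offset."""
    return (segment << 4) + offset


def wrapped_address(segment: int, offset: int) -> int:
    """The same calculation with the carry into bit 20 discarded."""
    return linear_address(segment, offset) & ADDRESS_MASK_20BIT


print(f"{linear_address(0xF01D, 0xFEF0):X}h")   # without wrap-around
print(f"{wrapped_address(0xF01D, 0xFEF0):X}h")  # with wrap-around
```

With the values from the typical PSP, the first line prints 1000C0h and the second prints C0h.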
Interrupt vector 30h is in fact not a vector at all; together with the first byte of interrupt vector 31h, it is a five-byte far jump instruction pointing to the CP/M compatible system call dispatcher. That fact is documented at least in The Undocumented PC and Ralf Brown’s Interrupt List. The CP/M compatible dispatcher adjusts the stack frame, moves the CL register contents to AH, and continues as the standard INT 21h DOS system call dispatcher.
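The overlap with the interrupt vector table is easy to verify. The following Python sketch only computes addresses; the constant and function names are made up for illustration.

```python
VECTOR_SIZE = 4      # each real-mode interrupt vector is a 4-byte far pointer
FAR_JUMP_LENGTH = 5  # opcode byte + offset word + segment word


def vector_address(number: int) -> int:
    """Linear address of an entry in the real-mode interrupt vector table."""
    return number * VECTOR_SIZE


jump_start = vector_address(0x30)
jump_end = jump_start + FAR_JUMP_LENGTH - 1  # address of the last byte

print(f"Vector 30h starts at {jump_start:X}h")
print(f"Vector 31h starts at {vector_address(0x31):X}h")
print(f"Far jump occupies {jump_start:X}h-{jump_end:X}h")
```

Vector 30h sits at C0h and vector 31h at C4h, so a five-byte instruction starting at C0h ends on the first byte of vector 31h.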
OK, that explains why address wrap-around and the A20 gate may be needed. But at the beginning it was mentioned that the CALL 5 interface works even in DOS emulation under Windows NT and OS/2, and those systems most certainly cannot run with the A20 line disabled. How does that work then? It’s actually very simple. Rather than chopping off address bits, the system mirrors the five bytes at 0:C0h at linear address 1000C0h. The same technique has in fact been used in DOS 5 and above running with DOS=HIGH. In that case, DOS makes sure that linear address 1000C0h contains the appropriate far jump.
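The two approaches can be compared in a toy Python memory model. Everything here is an assumption made for illustration: the `Memory` class is hypothetical, and the jump target in `JUMP_STUB` is a made-up placeholder, not a real DOS address.

```python
ADDRESS_MASK_20BIT = 0xFFFFF


class Memory:
    """Toy model of the first 1 MB plus the 64 KB above it."""

    def __init__(self, a20_enabled: bool):
        self.ram = bytearray(0x110000)
        self.a20_enabled = a20_enabled

    def _physical(self, linear: int) -> int:
        # With A20 disabled, address bit 20 is forced to zero. For the
        # addresses in this model that equals masking to 20 bits.
        return linear if self.a20_enabled else linear & ADDRESS_MASK_20BIT

    def write(self, linear: int, data: bytes) -> None:
        for i, value in enumerate(data):
            self.ram[self._physical(linear + i)] = value

    def read(self, linear: int, count: int) -> bytes:
        return bytes(self.ram[self._physical(linear + i)] for i in range(count))


# Far jump opcode EAh followed by a placeholder offset and segment.
JUMP_STUB = bytes([0xEA, 0x34, 0x12, 0x78, 0x56])

# Approach 1: A20 disabled, so 1000C0h aliases C0h.
wrapping = Memory(a20_enabled=False)
wrapping.write(0xC0, JUMP_STUB)
wrapped_view = wrapping.read(0x1000C0, 5)

# Approach 2: A20 enabled, so the stub must be copied to 1000C0h.
mirrored = Memory(a20_enabled=True)
mirrored.write(0xC0, JUMP_STUB)
before_mirror = mirrored.read(0x1000C0, 5)
mirrored.write(0x1000C0, JUMP_STUB)
after_mirror = mirrored.read(0x1000C0, 5)

print(wrapped_view == JUMP_STUB)
print(before_mirror == JUMP_STUB)
print(after_mirror == JUMP_STUB)
```

In this model the stub is visible at 1000C0h when A20 is off, missing when A20 is on, and visible again once the five bytes are mirrored.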
A problem with the compatibility interface occurs when the loaded program has in fact less than 64KB available. If that happens, the word at PSP offset 6 may not contain the correct value, but the CALL 5 interface will still work; the instruction at offset 5 will be CALL 0:C0h, making the reported program segment size C0h. It is unclear why DOS does that; it appears to be a bug in DOS 5.0 and later, as DOS 4.0 and earlier versions simply adjust the segment portion so that it wraps around to 0:C0h. That works as long as the program segment size is paragraph aligned, and it will be.
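The segment adjustment described for DOS 4.0 and earlier can be sketched in Python as follows. This is a reconstruction from the description above, not actual DOS code, and `call5_segment` is a hypothetical name.

```python
CALL5_TARGET = 0x1000C0  # 0:C0h plus 1 MB, i.e. the pre-wrap linear address


def call5_segment(segment_size: int) -> int:
    """Segment word that makes segment:segment_size wrap around to 0:C0h."""
    if not 0 <= segment_size <= 0xFFFF:
        raise ValueError("segment size must fit in a 16-bit word")
    if segment_size % 16 != 0:
        # A segment can only start on a 16-byte (paragraph) boundary.
        raise ValueError("segment size must be paragraph aligned")
    return ((CALL5_TARGET - segment_size) >> 4) & 0xFFFF


for size in (0xFEF0, 0x8000, 0x00C0):
    print(f"size {size:04X}h -> CALL {call5_segment(size):04X}:{size:04X}")
```

For FEF0h this gives the familiar F01D:FEF0. For a size of C0h the same formula yields 0:C0h, which happens to be the form the article reports for DOS 5.0 and later when less than 64KB is available.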
Note: Some references (including Undocumented DOS) incorrectly state that the CALL 5 interface is broken in DOS version 2.0 and later. It’s not. The misconception most likely stems from a minor bug in the DEBUG utility(?) shipped with those versions of DOS. When DEBUG is started without loading a program, the far call at PSP offset 5 is indeed incorrect and points two bytes too low. However, that applies only if DEBUG is run without loading a program—a convenient way to explore the DOS environment, but not the same as running an actual program.
How to check for CALL 5 interface compatibility? The CALL 5 Demo contains the source code and binary of a simple program which uses the CALL 5 interface to invoke DOS calls rather than the typical INT 21h instructions. This utility runs on all versions of DOS, from PC DOS 1.0 to PC DOS 2000. It also runs in an NTVDM session or in an OS/2 DOS box. The typical output is:
    Hello, DOS! Or is that CP/M?
    CALL 5 destination: F01D:FEF0
    Program segment size (hex): FEF0
    Memory size in paragraphs (hex): 9FFF
    Stack top (hex): FFFE
    Return address (hex): 0000
What does it mean? The CALL 5 interface transfers control to address F01D:FEF0, or linear address 1000C0, which either wraps around to 0:C0 or contains the correct jump instruction. If not, the program will most likely hang. The program segment size is the same as the offset of the CALL 5 destination and indicates 65,264 bytes of memory in the program segment. Memory size is slightly less than 640KB.
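The hexadecimal figures in the output convert to bytes as follows. This is a small Python check of the arithmetic only; the variable names are illustrative.

```python
PARAGRAPH_SIZE = 16  # one paragraph is 16 bytes

program_segment_size = 0xFEF0  # bytes, from the demo output
memory_paragraphs = 0x9FFF     # paragraphs, from the demo output

memory_bytes = memory_paragraphs * PARAGRAPH_SIZE
shortfall = 640 * 1024 - memory_bytes

print(f"Program segment size: {program_segment_size} bytes")
print(f"Memory size: {memory_bytes} bytes")
print(f"Short of 640KB by: {shortfall} bytes")
```

The reported memory size comes out exactly one paragraph below 640KB.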
The stack top is just below the end of the program segment, and the single word already on the stack contains zero. This means that a near return would transfer control to offset 0 in the PSP, which contains an INT 20h instruction that terminates the program. The CALL 5 demo uses a CALL 0 instead to terminate, which is an alternative way of doing the exact same thing.
It should be underscored that the CALL 5 demo is a tiny model .COM program which does not explicitly use segmentation at all, yet it implicitly relies on 8086-style address wrap-around. Especially on DOS 4.0 and earlier, it will only work if the system either has a real 8088/8086 CPU or emulates address wrap-around by turning off the A20 line (as any PC compatible should).
It remains to be seen which—if any—significant applications actually use the CP/M compatible system call interface.
Update: As one might expect, the CALL 5 demo still works just fine under the 32-bit version of Windows 8 Developer Preview. And DEBUG.COM still shows the same bug with the CALL 5 vector in the PSP being two bytes off.