Although WordStar was long suspected to be the reason (or at least one of the major reasons) for implementing the A20 gate hardware on the PC/AT and all the associated problems later on, it is now all but certain that this was not the case.
To recap, the earliest versions of WordStar for the IBM PC were 3.02 (probably April or May 1982), and 3.20 (likely Summer 1982). Whatever version 3.02 did or didn’t do, it was not compatible with PC DOS 1.1 or later, and thus could not have been relevant when the PC/AT was being designed. WordStar 3.20 has now been examined and found not to use the CALL 5 system call interface or do anything else that would cause problems on the PC/AT. WordStar 3.2x did use the word at offset 6 in the PSP to query the available memory, but not the call at offset 5.
Then it turned out that a crucial piece of evidence had been hiding in (almost) plain sight all along. Richard Wells highlighted U.S. Patent 4,779,187, “Method and operating system for executing programs in a multi-mode microprocessor” by Gordon Letwin. The filing date of the patent is April 10, 1985, less than a year after the IBM PC/AT was introduced, when these sorts of problems would still have been fresh in memory.
The patent contains the following text:
Some programs written for the 8086 rely on [address wrap-around] to run properly. Unfortunately, memory locations extend above 1 megabyte in the real mode of the 80286 and are not wrapped to low memory locations. Consequently, programs including those written in MicroSoft PASCAL and programs which use the “Call 5” feature of MS-DOS will fail on the standard 80286 system.
Microsoft Pascal, huh? Two paragraphs later, Pascal is mentioned again, in an explanation of how one might work around the problems:
For example, no PASCAL programs are loaded into memory below 64K, and a special instruction is placed in the lower memory locations above 1 megabyte–for example, address 100000h or 100010h.
So… Pascal programs might have trouble when loaded below 64K? What does that have to do with the A20 line? A lot, it turns out.
Too Clever By Half
A Pascal compiler (IBM Pascal 1.0, supplied by Microsoft) was part of the first batch of software packages available when the IBM PC was announced in 1981. It was also used to build commercial software, including Microsoft/IBM MASM and the Pascal compiler itself.
The early versions of MS/IBM Pascal used a memory model which might be called “mostly small”, with separate code and data and, optionally, far code segments. Heap, stack, and data were all located in a single physical data segment (DGROUP, naturally up to 64K).
There were certain implementation details which can only be described as “baroque”. MS Pascal had a heap growing from the bottom and a stack growing from the top. Interestingly, the stack size did not have a fixed limit, and as long as there was space in the middle, both the stack and the heap could grow. So far so good.
The problem was that statically allocated data and constants were placed at the top of the data segment, rather than at the bottom. The Pascal runtime start-up code tried to use up to 64K of memory for the data segment, and copied the static data from wherever they were loaded as part of the EXE image into their final location. The layout was very helpfully illustrated in the IBM Pascal manual (August 1981) on page 2-32:
Because the source and destination might overlap, the copy had to be done in reverse direction (from high to low addresses). Because the data to be copied was always at the top of the data segment, the copying (REP MOVSW) started at offset 65534 and continued downward.
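Why the direction matters can be shown with a small sketch. This is not the Pascal start-up code; it is a hypothetical word-by-word copy over a 16-byte buffer (the function names and sizes are made up for illustration) in which the destination overlaps the source and lies above it, as the text describes.

```python
def copy_words(mem, src, dst, count, backward):
    """Copy `count` 16-bit words inside `mem`, one word at a time.

    backward=True starts with the highest word and works down, which is
    the order a REP MOVSW with the direction flag set would use.
    """
    order = range(count - 1, -1, -1) if backward else range(count)
    for i in order:
        mem[dst + 2 * i:dst + 2 * i + 2] = mem[src + 2 * i:src + 2 * i + 2]
    return mem


def demo(backward):
    # Hypothetical layout: 8 bytes of "static data" at offset 0 must move
    # up to offset 4, so source and destination overlap by 4 bytes.
    mem = bytearray(range(16))
    return bytes(copy_words(mem, src=0, dst=4, count=4, backward=backward)[4:12])


if __name__ == "__main__":
    print("high to low:", demo(backward=True).hex())   # data arrives intact
    print("low to high:", demo(backward=False).hex())  # tail is overwritten early
```

Copying from the top down moves each word out of the way before anything can overwrite it; copying upward clobbers the overlapping words before they are read.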
So what happened if the Pascal-compiled executable was loaded such that the end of the data to be copied was below 64K? Why, of course, the segment register was “negative” and relied on address wrap-around to access the data!
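The arithmetic behind a “negative” segment can be sketched as follows. The numbers are hypothetical (static data assumed to end at linear address 8000h, i.e. 32K), and the helper names are invented for this illustration; only the segment × 16 + offset rule is the real 8086 addressing scheme.

```python
def ds_for_segment_top(top_linear):
    """16-bit DS value that puts the END of a 64K segment at `top_linear`."""
    base = top_linear - 0x10000          # may be below zero
    return (base >> 4) & 0xFFFF          # a "negative" DS shows up as F000h and up


def physical(segment, offset, wrap_at_1mb):
    """Real-mode address: segment * 16 + offset.

    wrap_at_1mb=True models the 8086 (20 address lines, or A20 masked);
    wrap_at_1mb=False models a 286 with the A20 line enabled.
    """
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    return linear & 0xFFFFF if wrap_at_1mb else linear


if __name__ == "__main__":
    ds = ds_for_segment_top(0x8000)      # assumed: data ends at the 32K mark
    print(f"DS = {ds:04X}h")
    print(f"8086:        DS:FFFE -> {physical(ds, 0xFFFE, True):06X}h")
    print(f"286, A20 on: DS:FFFE -> {physical(ds, 0xFFFE, False):06X}h")
```

With wrap-around, DS:FFFE lands just below the 32K mark where the data really is. Without it, the same access goes to memory above 1 MB.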
This caused one avoidable and one unavoidable problem. The avoidable one was the copy itself: it could have been rewritten so that the segment register pointed at the lowest data location to be copied, with the offset adjusted accordingly. That would only have made the start-up code slightly more complicated.
The worse, unavoidable problem was that if there wasn’t enough memory in the system (and remember, the IBM PC was available in configurations with less than 64K RAM total!), the bottom of the data segment would still be “below zero”, and DS had to be “negative” and rely on address wrap-around. That would have been much more difficult to solve because it would have required applying additional relocations to code and data, and the Pascal run-time was not equipped to deal with that.
That is exactly why the Letwin patent says that the problem could be avoided by not loading Pascal programs below 64K. The minimal PC/AT configuration was 256K RAM, so it would have been theoretically doable—the difficulty would have been in detecting such programs.
And this is also likely why later language run-times were designed such that any static data were placed at the bottom of the data segment, with heap/stack above (rather than below) static data. Then there is no need to copy static data at load time and no need to potentially address DGROUP such that it requires address wrap-around.
To be absolutely clear, relying on address wrap-around was not some kind of bug in the Pascal run-time; it was entirely intentional. How do we know that? Because Microsoft/IBM were kind enough to supply the start-up source code. The comments are unambiguous: “DX is final DS (may be negative)” and “final DS value (may be negative)”.
When Is a Bug a Bug?
The address wrap-around exploitation together with an unrelated signed comparison bug raises an interesting philosophical question. Is software buggy when it fails in an environment that it was not written for, not tested with, and which didn’t even exist when the software was written?
If the answer is “yes”, then arguably all software ever written is buggy, including the simplest hello world programs. It is always possible to change the environment in ways the software never anticipated.
If the answer is “no”, then we must accept that Microsoft Pascal was not buggy but merely odd. In 1981, it used an artifact of the 8086 architecture without any ability to predict that it would go away in 1982 (when the 80286 was introduced). Likewise it used a signed comparison for memory size which failed on systems with more than 512K RAM… at a time when a beefy IBM PC had 128K RAM.
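The general failure mode of a signed 16-bit memory-size check is easy to reproduce. The sketch below is not the actual run-time code (the exact comparison is not shown here); it assumes, for illustration only, that memory is measured in 16-byte paragraphs and that a hypothetical check compares the top of memory against a required paragraph count.

```python
def as_signed16(value):
    """Reinterpret a 16-bit quantity as a signed number, as JL/JG would."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def paragraphs(kib):
    """Memory size in 16-byte paragraphs for a size given in KiB."""
    return kib * 1024 // 16


def enough_memory(top, needed, signed):
    """Hypothetical check: is the top-of-memory paragraph at or above `needed`?"""
    if signed:
        return as_signed16(top) >= as_signed16(needed)
    return (top & 0xFFFF) >= (needed & 0xFFFF)


if __name__ == "__main__":
    needed = paragraphs(64)              # assumed requirement, for illustration
    for kib in (128, 256, 640):
        top = paragraphs(kib)
        print(f"{kib:3d}K top={top:04X}h "
              f"signed={enough_memory(top, needed, True)} "
              f"unsigned={enough_memory(top, needed, False)}")
```

Paragraph counts of 8000h and above have the top bit set, so a signed comparison sees them as negative and a machine with plenty of RAM appears to have almost none.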
The A20 Gate
Thanks to the patent, we know that Microsoft/IBM Pascal was a notable reason for preserving address wrap-around, and thus for the A20 gating logic that this necessitated. On a 386, it might be possible to run the CPU in Virtual-8086 mode and use paging to simulate the wrap-around (which was in fact done in some contexts; more on that later). But that was no help with the 286-based PC/AT, which predated the 386 in any case.
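The paging trick can be modeled in a few lines. This is a toy model rather than real page-table code: it assumes 4K pages and uses a plain dictionary (a made-up stand-in for page tables) to alias the 16 pages just above 1 MB onto the first 64K of memory.

```python
PAGE_SIZE = 4096
FIRST_PAGE_ABOVE_1MB = 0x100000 // PAGE_SIZE     # page 100h
PAGES_REACHABLE = 0x110                          # FFFF:FFFF is 10FFEFh, in page 10Fh


def build_page_map(simulate_wrap):
    """Map linear page numbers to physical page numbers (identity by default)."""
    page_map = {page: page for page in range(PAGES_REACHABLE)}
    if simulate_wrap:
        # Alias the 64K above 1 MB onto the bottom of memory,
        # which is what masking the A20 line would do in hardware.
        for page in range(FIRST_PAGE_ABOVE_1MB, PAGES_REACHABLE):
            page_map[page] = page - FIRST_PAGE_ABOVE_1MB
    return page_map


def translate(page_map, linear):
    """Translate a linear address to a physical address via the page map."""
    return page_map[linear // PAGE_SIZE] * PAGE_SIZE + linear % PAGE_SIZE


if __name__ == "__main__":
    linear = (0xF800 << 4) + 0xFFFE              # F800:FFFE, a "negative" DS access
    for wrap in (False, True):
        phys = translate(build_page_map(wrap), linear)
        print(f"wrap={wrap}: {linear:06X}h -> {phys:06X}h")
```

Because the aliasing is per address space, a system built this way can in principle give wrap-around only to the tasks that need it.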
In a way the Pascal run-time created the worst possible problem for the PC/AT designers. Not only were known commercial applications written in Pascal affected, but also an unknown and unknowable number of user-written applications built with the compiler. Nor was the problem confined to one clearly delineated interface (CALL 5); it involved fairly random code which relied on wrap-around to address arbitrary data.
While a new version of DOS might have solved the issue by forcing Pascal applications to be loaded above 64K (as the patent suggested), that is much easier said than done. By 1984, there were already several variants of the Pascal start-up code in the wild, and that was only considering official IBM/MS products. The reason the start-up source code was distributed was so that users could modify it, which meant that there was an unknown and unknowable number of modified variants of the start-up code out there. And even if all such code could be detected, software workarounds would still not have helped with any existing bootable floppies using pre-PC/AT versions of DOS (i.e. DOS 1.x/2.x).
In the end, implementing the A20 gate was the only safe choice. Initially it caused very little trouble because A20 was always disabled prior to booting an OS, so addresses wrapped at 1 MB just as on the 8086 (as long as it really was disabled). The problems started a few years later, when DOS extenders came on the scene (circa 1988) and faced the unpleasant reality that there was no BIOS interface to control the A20 gate, and that some not-so-PC-compatibles implemented A20 hardware controls which worked nothing like the IBM PC/AT. The problems were exacerbated in DOS 5.0 days (1991) when core DOS could be loaded into high memory and the A20 gate control really mattered.
It’s clear that the company most responsible for all the A20 gate trouble was Microsoft, although the ultimate decision no doubt lay with IBM. The actual person responsible for the Pascal start-up code may have been Bob Wallace, one of the first Microsoft employees and later the author of PC-Write (written in Pascal); that is only speculation though.
In the 1990s, the most common troublemaker related to the A20 gate was EXEPACK (another Microsoft tool), or rather DOS executables packed with EXEPACK. However, it has now been established beyond reasonable doubt that—ironically—EXEPACK was created well after the A20 gate already was in place on the PC/AT.
More about EXEPACK in a follow-up post, and more about all the pain the A20 gate caused years later in another follow-up post.