WordStar needs address wraparound?

The CP/M compatible interface in DOS was initially documented, later forgotten, and then re-discovered every once in a while.

In 1989, John Switzer described parts of the CALL 5 system call interface mechanism in a slightly hysterical article as a “back door” into DOS and called it a security risk, despite the fact that it was a compatibility interface very deliberately maintained in every version of DOS. However, the article correctly pointed out that the CALL 5 interface bypassed INT 21h hooks. In theory, that could have been used for nefarious purposes; then again, worrying about that on the DOS platform was like being gravely concerned that cold wind could get in through a small crack in the window in a house without a roof. DOS simply wasn’t a secure foundation and patching a tiny hole couldn’t fix the fact that any program running on DOS effectively owned the entire system.

On page 5 of Undocumented DOS (first edition, 1990), Andrew Schulman wrote: “For example, in DOS 1.0 Microsoft documented the fact that, in addition to using INT 21h, applications could call operating system functions with a CALL 5 instruction. This DOS holdover from CP/M was used by several then important programs, including WordStar.” That suggests the CALL 5 interface had at one point been officially documented for Microsoft’s DOS, not just SCP’s 86-DOS. It is unknown whether IBM’s DOS Technical References documented the CALL 5 interface or not.

Incidentally, the CALL 5 interface does not appear to be mentioned in the second edition of Undocumented DOS. However, the second edition is very significantly different from the first, to the point where it’s almost misleading to call it the second edition.

At any rate, WordStar is mentioned as one of the applications which actually used the CALL 5 interface. That is not so surprising, because WordStar had been originally written as a CP/M application.

In 2000, the CALL 5 interface and WordStar made another appearance, although it’s not clear whether Joel Spolsky’s article on chicken and egg problems supports the link or not. It does say that WordStar was ported to DOS with almost no changes; unfortunately, the article is so riddled with wildly inaccurate and just plain wrong claims that it’s difficult to take it very seriously. Calling XENIX an 8-bit version of UNIX is either a joke gone wrong or amazingly ignorant—when was XENIX 8-bit, and what would be the point of running an 8-bit OS on a 16-bit CPU? The article also says: “In fact, WordStar was ported [from CP/M] to DOS by changing one single byte in the code.” It’s unclear how that would have worked when DOS was designed for easy porting of CP/M applications running on the 8-bit 8080 CPU—when DOS was written, 16-bit CP/M didn’t even exist, and in fact that was the whole reason why DOS (née 86-DOS, née QDOS) had been written in the first place!

As a side note, Spolsky’s claim that at the release of the IBM PC, DOS competed against XENIX and UCSD p-System is also very wide of the mark. The three operating systems initially announced for the IBM PC were DOS, UCSD p-System, and CP/M-86 (rather than XENIX, which came much later). But more importantly, neither p-System nor CP/M-86 was available at the launch, and for more than half a year DOS had no competition whatsoever.

Other curiously inaccurate and misleading statements have been made in relation to the CALL 5 system call interface. For example, the otherwise excellent Undocumented PC by Frank van Gilluwe mentions that interrupt vectors 30h and 31h in fact contain a 5-byte far jump instruction, but then goes on to say that “the new DPMI service handler in protected mode uses interrupt 31h, which destroys the last byte of this 5 byte far jump.” That is a self-contradictory statement, since the protected mode interrupt vector table (IVT) is completely different and separate from the real-mode IVT!

Reading the DPMI specification, it is clear that INT 31h is protected-mode only; determining DPMI presence and switching to protected mode is done without any use of INT 31h. Indeed, even when a DPMI server like Quarterdeck’s QDPMI or the 386MAX DPMI server from Qualitas is installed (or for that matter, Windows 3.x in Enhanced mode), the far jump at 0:C0h is undisturbed.
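To make the real-mode layout concrete, here is a minimal sketch of the arithmetic. It assumes only the standard real-mode IVT layout (4 bytes per vector, starting at physical address 0) and the 5-byte far jump at 0:C0h mentioned above; the helper names `vector_range` and `overlapped_vectors` are made up for this illustration.

```python
# Real-mode IVT: 4 bytes per vector (16-bit offset, then 16-bit segment),
# starting at physical address 0.
VECTOR_SIZE = 4

# A far jump is 5 bytes long: opcode EAh, 16-bit offset, 16-bit segment.
FAR_JMP_LEN = 5


def vector_range(n):
    """Physical byte addresses occupied by real-mode vector n (hypothetical helper)."""
    start = n * VECTOR_SIZE
    return range(start, start + VECTOR_SIZE)


def overlapped_vectors(byte_range):
    """Vector numbers whose storage shares at least one byte with byte_range."""
    return sorted({addr // VECTOR_SIZE for addr in byte_range})


# The jump starts where vector 30h would normally be stored.
jump_start = vector_range(0x30).start
jump_bytes = range(jump_start, jump_start + FAR_JMP_LEN)

print(hex(jump_bytes[0]), hex(jump_bytes[-1]))            # 0xc0 0xc4
print([hex(v) for v in overlapped_vectors(jump_bytes)])   # ['0x30', '0x31']
```

The jump fills all of vector 30h and spills one byte into vector 31h, which is why neither vector can be used as a normal real-mode interrupt. None of this touches the table that protected-mode code uses.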

But back to WordStar. According to the unofficial WordStar history, the DOS version of WordStar 3.0 (released in April 1982) was converted from CP/M with minimal modifications. It is plausible that it used the CP/M style CALL 5 system call interface. Unfortunately, without a copy of the DOS version of WordStar 3.0, this cannot be confirmed.

By version 3.3, the DOS version of WordStar had already been converted to use the INT 21h system call interface, although it was still limited to CP/M functionality—most importantly, no support for directories.

It is plausible that WordStar 3.0 may have been the one important application which forced the development of the A20 gate and the associated nonsense, but it is unlikely that it was the only reason. If anyone knows of other significant software which provably required 8086-style address wraparound, either directly or (as was likely the case for WordStar 3.0) indirectly, please leave a comment.
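For readers unfamiliar with the wraparound in question, the following sketch shows the address arithmetic. The segment:offset pair F01D:FEF0 is an illustrative value, not one taken from this article, and the function `physical_address` is a hypothetical model rather than a description of any real hardware interface. It assumes a 20-bit address bus for the 8086 and a 24-bit bus for the 286 in the PC/AT.

```python
def physical_address(segment, offset, address_lines=20, a20_enabled=True):
    """Model real-mode address formation (hypothetical helper).

    The linear address is segment * 16 + offset. Bits beyond the CPU's
    address lines are lost; with the A20 gate disabled, address line 20
    is additionally forced to zero.
    """
    linear = (segment << 4) + offset
    addr = linear & ((1 << address_lines) - 1)
    if not a20_enabled:
        addr &= ~(1 << 20)
    return addr


# Illustrative far pointer whose linear address is just past 1 MB (assumption).
seg, off = 0xF01D, 0xFEF0

print(hex(physical_address(seg, off, address_lines=20)))                     # 0xc0
print(hex(physical_address(seg, off, address_lines=24)))                     # 0x1000c0
print(hex(physical_address(seg, off, address_lines=24, a20_enabled=False)))  # 0xc0
```

With 20 address lines the pointer lands on 0:C0h; with 24 lines it lands just past the 1 MB mark unless address line 20 is masked, which is the job the A20 gate performs.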


9 Responses to WordStar needs address wraparound?

  1. Yuhong Bao says:

    EXEPACK had a bug where it relied on address wraparound *if* an EXEPACKed program was loaded under the 64K physical address boundary, which was hidden for years until DOS 5.0 added support for loading DOS in the HMA.

  2. Yuhong Bao says:

    “In 1989, John Switzer described parts of the CALL 5 system call interface mechanism in a slightly hysterical article as a “back door” into DOS and called it a security risk”
    Reminds me of this, BTW:

  3. michaln says:

    That is true, but irrelevant. Did EXEPACK even exist before the PC/AT? As far as I can tell, it didn’t. And as you say, when it did show up, people did not realize that it in some cases relied on address wraparound. So as a reason for implementing the gate A20 hardware, I don’t see how EXEPACK matters at all (especially given the LOADFIX workaround). We’re looking here for reasons for A20 hardware which existed in 1983-1984…

    For what it’s worth, the ‘Packed file is corrupt’ problem appears with lightly configured DOS 3.0-3.2 if the A20 line is enabled. With DOS 3.3, the bare OS already uses enough memory that the problem doesn’t happen. DOS 2.x and earlier is not so relevant because that wasn’t very useful on a PC/AT.

  4. Yuhong Bao says:

    I think EXEPACK was created in the DOS 2.x era. I don’t have exact dates or copies though.

  5. michaln says:

    The oldest LINK.EXE with /EXEPACK support and the oldest EXEPACK utility I could find were both from 1985. The copyright message in the old EXEPACK says 1985, which is a strong hint that it’s not older. The oldest DOS which shipped any exepacked utilities that I could find was 4.0. I see the EXEPACK utility (and lots of exepacked files) in MASM 4.0 (1985), but nothing in MASM 3.0 (late 1984).

    Interestingly, SYMDEB 3.01 from June 1985 contains a “Can’t debug packed files” message, but there is no such message in SYMDEB 3.00 from December 1984. I simply see no evidence that EXEPACK predates the PC/AT. I’d be happy to revise my opinion if someone shows me an older EXEPACK implementation.

  6. Julien Oster says:

    Great article! One thing:

    “since the protected mode interrupt vector table (IVT) is completely different and separate from the real-mode IVT!”

    Regarding protected mode, you probably meant to write Interrupt Descriptor Table (IDT)?

  7. michaln says:

    Yes and no. Interrupt vector table is a generic term; the x86 protected-mode IDT is one possible implementation. At any rate, the point is that the DPMI service at INT 31h is not accessible from real mode and not in the real-mode IVT (DPMI only provides a few INT 2Fh services in real mode).

  8. Julien Oster says:

    Hmm. I was intuitively thinking that an “Interrupt Vector Table” applies more to a “classic” list of simple vectors, i.e. addresses, probably in a particular order or maybe in the form of very simple mapping from interrupt numbers to addresses.

    But you (and Wikipedia) are right in that the term “Interrupt Vector Table” doesn’t really constrain the table to such a simple table of vectors, as the descriptors there somehow *do* all contain a vector to something or other. Although I think that that’s almost a stretch, especially given that there could also be task gates in it…

    Well anyway, you’ve convinced me that it’s not a mistake.

  9. Pingback: Another witness against WordStar | OS/2 Museum
