Where Did CP852 Come From?

In the 1990s, a lot of my documents were written in code page 852 (CP852), also known as PC Latin 2. This code page is sometimes called “Eastern European”, which is a bit misleading, given that it does not cover major Eastern European countries like Ukraine; sometimes it is also called “Slavic”, which is no less misleading because it covers languages like Hungarian or Albanian that aren’t remotely Slavic.

In those days, fighting with code pages was a constant source of annoyance and pain. DOS and OS/2 used CP852, Windows used CP1250, and Unix/Linux used ISO 8859-2. Of course these code pages were all incompatible with each other. The worst problem was the early web, where content was often offered in some 8-bit encoding but with no hint as to which encoding that might have been (let’s play a guessing game!). It is a real shame that UTF-8 hadn’t come a bit earlier.
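To make the incompatibility concrete, here is a minimal Python sketch. It relies only on the `cp852`, `cp1250` and `iso8859_2` codecs that ship with Python's standard library; the sample text and the helper names are made up for illustration.

```python
# Encode the same Czech text in the three competing 8-bit code pages.
CODE_PAGES = ("cp852", "cp1250", "iso8859_2")
SAMPLE = "žluťoučký kůň"  # illustrative sample text, not from any real document


def encode_everywhere(text):
    """Return {codec name: encoded bytes} for each of the three code pages."""
    return {name: text.encode(name) for name in CODE_PAGES}


def misread(text, written_as, read_as):
    """Decode with the wrong code page, as a reader of an unlabelled page might."""
    return text.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    for name, raw in encode_everywhere(SAMPLE).items():
        print(f"{name:10} {raw.hex(' ')}")
    # The guessing game: CP852 bytes interpreted as CP1250.
    print(misread(SAMPLE, "cp852", "cp1250"))
```

The ASCII letters come out identical in all three encodings; only the accented characters differ, which is exactly why a wrong guess produces text that is almost, but not quite, readable.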

In the early to mid-1990s the situation was further complicated by several non-standard encodings, like the Kamenický brothers encoding in Czechoslovakia or the Mazovia encoding in Poland. Those encodings originated in the mid-1980s and tended to preserve most of the CP437 semi-graphic characters. Code page 852 did not, but on the other hand it covered quite a few languages. Users initially preferred the non-standard national encodings because those worked better for them, but built-in operating system support pushed those out.

And now I started wondering: When did CP852 become available to users, and where did it actually come from? The first question can be answered reasonably accurately, while the second remains unclear.

Posted in DOS, I18N, IBM, Microsoft, OS/2, PC history | 47 Comments

XMVM Surgery

Last week I was prompted to take a look at the Intel Code Builder compiler from 1991, a 32-bit compiler targeting 386 extended DOS and shipping with its own DOS extender. It is what one might call an extremely obscure compiler; it had to compete with established offerings from MetaWare, Watcom, or Zortech, and soon also with the compiler heavyweights Borland and Microsoft.

One of Code Builder’s very few claims to fame is that early alpha releases of id Software’s DOOM were built with Intel Code Builder 1.1, before id switched to Watcom compilers for the DOS releases of DOOM.

There is one poorly preserved archive of Code Builder 1.0 available. As others have noticed, it won’t even build the trivial hello world program that comes with it:

Linker error caused by missing XMVM

There should be a file called XMVM installed with Code Builder, but it’s just not there. Since it is required by default for linking of 32-bit executables, nothing works unless the /XNOVM switch is passed to the compiler/linker.

The OS/2 Museum happens to have an archive of the Intel 386/486 C Code Builder Kit v1.0 installer which clearly explains why the other copies have no XMVM file. It is in the installer archive… corrupted, and cannot be uncompressed:

XMVM is missing because installer was corrupted

The compiler can be installed, but the XMVM file will be missing.

But wait! If the XMVM module is linked into executables produced by Code Builder, perhaps there is a way to recover it from Code Builder itself?

Posted in 386, Development, Intel, PC history, Software Hacks | 10 Comments

Another Trip to Drive Geometry Hell

Recently I took another close look at the IDE.DSK driver in NetWare 3.12. Among other things, I wanted to know how it differs from ISADISK.DSK. On some systems, the two drivers are interchangeable and either will work. But there are also systems that only one or the other driver can handle.

The ISADISK.DSK driver should really be called ATDISK.DSK because it’s written to the PC/AT fixed disk programming interface. As such, it will work with ST506 style MFM/RLL drives attached to an AT-compatible controller. It will also work with ESDI drives attached to an AT-compatible controller (such as the WD1005A or WD1007V). And it will also work with IDE drives.

However, ISADISK.DSK will only work with up to two drives supported by the system’s BIOS. ISADISK.DSK looks at the drive type information in the CMOS and relies on the FDPT (Fixed Disk Parameter Table, pointed to by interrupt vectors 41h/46h) to query geometry information. Said geometry is used internally and also fed to the INITIALIZE DRIVE PARAMETERS command.

For that reason, ISADISK.DSK will work with IDE drives as long as the FDPT geometry is compatible with IDE, i.e. does not have more than 16 heads; the BIOS limit on sectors per track, 63, is lower than IDE’s 255, so that won’t be a problem.
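The compatibility condition can be written down as a small check. This is only a sketch of the rule as stated above, not a reconstruction of what ISADISK.DSK actually does; the `FdptGeometry` class and the function name are hypothetical.

```python
from dataclasses import dataclass

# Limits taken from the text above: IDE allows at most 16 heads,
# and the BIOS allows at most 63 sectors per track.
IDE_MAX_HEADS = 16
BIOS_MAX_SECTORS_PER_TRACK = 63


@dataclass(frozen=True)
class FdptGeometry:
    """Hypothetical container for the geometry fields read from the FDPT."""
    cylinders: int
    heads: int
    sectors_per_track: int


def usable_with_ide(geo):
    """Return True if this FDPT geometry could be handed to an IDE drive as-is."""
    return (1 <= geo.heads <= IDE_MAX_HEADS
            and 1 <= geo.sectors_per_track <= BIOS_MAX_SECTORS_PER_TRACK)


if __name__ == "__main__":
    print(usable_with_ide(FdptGeometry(1024, 16, 63)))  # within both limits
    print(usable_with_ide(FdptGeometry(1024, 64, 63)))  # too many heads for IDE
```

In practice only the head count can fail, since a BIOS-supplied sectors-per-track value never exceeds 63 in the first place.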

Posted in IDE, NetWare, PC history | 8 Comments

Another Myth Busted

More than once I came across a story of a heroic MicroPro programmer who in an all-night session managed to port WordStar from CP/M to DOS by patching a single byte. This is how the legend was retold by Joel Spolsky:

Now, here’s a little known fact: even DOS 1.0 was designed with a CP/M backwards compatibility mode built in. Not only did it have its own spiffy new programming interface, known to hard core programmers as INT 21, but it fully supported the old CP/M programming interface. It could almost run CP/M software. In fact, WordStar was ported to DOS by changing one single byte in the code. (Real Programmers can tell you what that byte was, I’ve long since forgotten).

Joel Spolsky

Now, that story is slightly misleading. The “spiffy new programming interface” accessible through INT 21h pretty much was the CP/M programming interface, and it wasn’t until DOS 2.0 that the INT 21h interface was significantly enhanced.

But the gist of the story does not even make sense. Although DOS was designed to make porting from CP/M easy, it was never a question of patching a byte here or there, since CP/M ran on 8080 CPUs and DOS ran on 8086/8088 processors. The processor families are certainly related, but not at all binary compatible. 8080 assembly source code could be machine translated to 8086 source and reassembled, but the code quality was reportedly less than ideal.

And yet… there is a kernel of truth in the story, even though it morphed into something highly implausible. In much the same way, there really are Wang word processor symbols in the IBM PC character set, even though the stories told by Bill Gates are very difficult to take seriously.

Posted in CP/M, DOS, PC history, WordStar | 5 Comments

Unidentified PC DOS 1.1 Boot Sector Junk Identified

Anyone trying to disassemble the PC DOS 1.1 boot sector soon notices that at offsets 1A3h through 1BEh there is a byte sequence that just does not belong. It appears to be a fragment of code, but it has no purpose in the boot sector and is never executed. So why is the sequence of junk bytes there, and where did it come from?

The immediate answer is “it came from FORMAT.COM”. The junk is copied verbatim from FORMAT.COM to the boot sector. But those junk bytes are not part of FORMAT.COM, either. So the question merely shifts to “why are the junk bytes in FORMAT.COM, and where did they come from?”
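For anyone who wants to look at the bytes themselves, the following Python sketch pulls the range 1A3h through 1BEh out of a 512-byte boot sector and searches for it in another file. The function names are hypothetical, and any file names passed in are whatever images the reader happens to have.

```python
SECTOR_SIZE = 512
JUNK_START = 0x1A3
JUNK_END = 0x1BE  # inclusive, as given in the text above


def extract_junk(boot_sector):
    """Return the bytes at offsets 1A3h..1BEh of a boot sector."""
    if len(boot_sector) < SECTOR_SIZE:
        raise ValueError("expected at least one full 512-byte sector")
    return boot_sector[JUNK_START:JUNK_END + 1]


def find_junk(haystack, junk):
    """Return the offset of the junk bytes in another file's contents, or -1."""
    return haystack.find(junk)


def load(path):
    """Read a whole file; the caller supplies the path to their own image."""
    with open(path, "rb") as handle:
        return handle.read()
```

With a disk image and a copy of FORMAT.COM at hand, `find_junk(load(format_com_path), extract_junk(load(image_path)))` would show where in FORMAT.COM the sequence lives, assuming it really is copied verbatim as described.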

It is not known if anyone answered the question in the past, but the answer has been found now, almost 40 years later—twice independently.

Posted in Development, DOS, PC history | 23 Comments

First Dual-Channel IDE?

The OS/2 Museum recently came into possession of what may be the first adapter with support for two IDE channels… sort of:

Two-cable IDE adapter, 1989

The adapter was made by Plus Development Corporation, a subsidiary of the disk maker Quantum. This particular specimen was manufactured in 1989, though its BIOS has a 1988 copyright.

Posted in IDE, PC hardware, PC history, Quantum | 29 Comments

LAN Manager 2.1/2.2 Registration

Anyone who spent a bit of time archiving software distributed on floppies probably knows this situation: There’s only one disk set of a given software release known to exist, and it’s not clean. That is, it’s been previously used to install the software and the installer scribbled something on the floppies, like the user’s name or a serial number.

Installing a clean copy of LAN Manager 2.2

Restoring the disks to a virgin state is desirable, but that’s easier said than done. There are easy cases like the OS/2 installation boot floppy: The installer creates a file named INSTALL.LOG and writes a couple of things to it, but does not modify any existing files. Such behavior is relatively harmless because the additional file can be simply ignored, or all traces of it can be thoroughly erased.

LAN Manager 2.1 and 2.2 are in a different category. The setup program “burns in” the user name, modifying a file on one of the installation disks. That is far more difficult to undo because an existing file’s timestamp is changed and its contents modified. Restoring the original content is hard because it’s not at all obvious what it had originally been, although having just one “clean” set of LAN Manager disks helps.

Posted in Archiving, Debugging, Microsoft, Software Hacks | 18 Comments

The Secret History of ATAPI

The other day I asked myself a seemingly trivial question: What was the first ATAPI CD-ROM drive and when was it available? Given that ATAPI was a major technology which instantly obsoleted all proprietary CD-ROM interfaces and made SCSI much less desirable, one might expect that there would have been some press releases touting the advantages of the new technology, articles describing the whys and wherefores, but… nope. There is nothing.

A rather old IDE CD-ROM drive with curious jumper settings

In 1993, CD-ROM drives used either SCSI or one of several proprietary interfaces, the major amongst those being Matsushita/Panasonic, Mitsumi, Philips, and Sony. In 1995, the proprietary interfaces were effectively gone and most new CD-ROM drives used the ATAPI interface. Something clearly happened in 1994, but exactly what, when, and how—that’s something of a mystery.

Posted in CD-ROM, PC history, Standards, Undocumented | 53 Comments

Looking for High Sierra

Some time ago, I thought it would be useful to understand exactly what the difference is between CD-ROMs recorded in the old High Sierra format and those following the ISO 9660 standard. This was in part spurred by the fact that I have a number of CD-ROMs/images that use the High Sierra format (Microsoft Programmer’s Library, some IBM Developer Connection issues, OS/2 Warp 4, and more) that both macOS and Windows 10 refuse to mount. The other part of my motivation was the usual insatiable curiosity.

Finding the actual text of the High Sierra Working Paper (also High Sierra Proposal, i.e. proposed standard) turned out to be rather unexpectedly difficult. I found a number of articles talking about the High Sierra Proposal (HSP) but not the actual HSP text.

The closest thing I could find was an article in the excellent PC Tech Journal in the July 1987 issue (Patterning CD-ROM by Peter Jansson, page 163). Said article recaps the HSP in very good detail but it’s not the actual text. But even that was enough to show that although the structure defined in the High Sierra format is not far from the ISO 9660 standard, the two data structures are just different enough to be mutually incompatible.
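One visible symptom of that incompatibility is the volume descriptor signature. The sketch below is based on my understanding of the two layouts rather than on anything quoted above: in both formats the first volume descriptor sits in sector 16, ISO 9660 stores the identifier `CD001` at byte 1 of the descriptor, and High Sierra stores `CDROM` at byte 9. It assumes an image with plain 2048-byte sectors, and the function name is hypothetical.

```python
SECTOR_SIZE = 2048            # assumes a cooked image, not raw 2352-byte sectors
FIRST_DESCRIPTOR_SECTOR = 16  # volume descriptors start after the system area


def identify_format(image):
    """Classify a CD-ROM image by the signature in its first volume descriptor."""
    start = FIRST_DESCRIPTOR_SECTOR * SECTOR_SIZE
    descriptor = image[start:start + SECTOR_SIZE]
    if descriptor[1:6] == b"CD001":
        return "ISO 9660"
    if descriptor[9:14] == b"CDROM":
        return "High Sierra"
    return "unknown"
```

The same fields exist in both descriptors, but shifted and tagged differently, so software that looks only for `CD001` will simply not recognize a High Sierra disc.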

Posted in CD-ROM, Documentation, PC history | 23 Comments

Deeper Into ATA History

While looking for something completely unrelated (namely the Rock Ridge extensions to ISO 9660), I came across a cache of old X3T9 committee documents from 1990. In retrospect I’m a little surprised that I hadn’t found these earlier, since the archive appears to have been published on one of the Walnut Creek CD-ROMs circa 1994, but I’m not sure how long it’s been online.

What’s interesting is that the Walnut Creek archive appears to overlap with the X3T9.2 archive that has been available for a long time, but contains numerous documents that the X3T9.2 archive does not. Notably there’s a directory with CAM Committee documents. While the CAM Committee’s primary objective was to define a Common Access Method (CAM) for software accessing SCSI devices, an effort that ultimately went nowhere, the CAM Committee also started a rather more successful side project, the AT Attachment (ATA) standard.

The archive is far from complete, but it does include one complete ATA draft, revision 2.1 from June 11, 1990. That’s one revision older than the oldest ATA draft I was aware of until now, which is revision 2.2 from August 1990. The rev 2.1 draft is provided as a ZIP archive containing WordStar files, which is excellent for seeing exactly how the draft was edited (and the WordStar files include a couple of editorial comments that do not show up in the printed version), but the downside is that getting from WordStar to PDF was not entirely trivial. In the end I was able to produce a PDF of ATA Rev 2.1 in 2-up format that’s quite similar to the scanned documents in the X3T9.2 archive.

Even better, the Walnut Creek archive includes what appears to be the very first and quite incomplete ATA standard draft from March 30, 1989. Said draft also provides a hint why a SCSI oriented committee started ATA in the first place: The early ATA drafts also included a specification of EATA (Extended AT Attachment), a SCSI pass through mode of ATA devices (completely separate from and much older than ATAPI).

Sadly the initial draft, which is so old that it’s called DAD (Disk ATBus Definition) rather than ATA, does not include the EATA sections. In the next oldest currently available draft (revision 2.1), EATA had already been removed again. ATA revisions 1.x appear to have included the SCSI pass through functionality defined by EATA.

EATA was the brainchild of DPT (Distributed Processing Technology), one of the larger SCSI HBA vendors. An overview of EATA can be found here. I don’t believe anyone besides DPT implemented EATA, but the idea behind it was quite interesting.

CHM’s oral history of Dal Allan describes how EATA was created by DPT and desired by Quantum, but WD successfully fought to remove it from the standard for cost reasons, only to implement the same idea (SCSI pass through over ATA) as ATAPI a couple of years later.

The first ATA draft from March 1989 notably already defines the IDENTIFY DRIVE command as well as READ/WRITE MULTIPLE, but there is no sign of DMA support yet. The DASP signal for letting drive 0 detect drive 1 was also already defined, although the details were refined many times since then.

Finding the very first ATA draft is something I doubted would ever happen. Now I wonder if the revision 1.x ATA drafts might eventually turn up, too.

The list of early ATA drafts on this site has now been appropriately updated.

Posted in IDE, PC history, Standards | 17 Comments