KEYBCS2

After writing about the likely origins of IBM code page 852, I thought I should revisit the homegrown Czech alternative solution, the Kamenický brothers encoding and their keyboard driver. Its existence is well documented, and the so-called (somewhat misnamed) KEYBCS2 encoding even has its own Wikipedia article. The encoding itself lives on in various conversion tables, and utilities to convert text to or from the Kamenický encoding are easy enough to locate. Sometimes the encoding is also called MJK—the initials of its authors, Marian and Jiří Kamenický.

But finding the actual KEYBCS2 utility turned out to be ridiculously difficult. I scoured the Internet for it. I could not find it. At all. I found a fair amount of text talking about it, but not the actual utility.

In desperation, I started searching my NAS. I must have had the utility in the early 1990s, but after I switched to primarily using OS/2 in the mid-1990s, the DOS keyboard driver wasn’t all that useful, and OS/2 had its own reasonably well functioning support using CP852 (compatible with the built-in DOS support).

After much searching, I found an archive with KEYBCS2.EXE dated 07/27/90 on my NAS. Sadly, all my attempts to run it ended up in failure:

What is this nonsense?!

Obviously I was not trying to debug the program, but I was forced to do so.

Continue reading
Posted in DOS, I18N, IBM, x86 | 29 Comments

Where Did CP852 Come From?

In the 1990s, a lot of my documents were written in code page 852 (CP852), also known as PC Latin 2. This code page is sometimes called “Eastern European”, which is a bit misleading, given that it does not cover major Eastern European countries like Ukraine; sometimes it is also called “Slavic”, which is no less misleading because it covers languages like Hungarian or Albanian that aren’t remotely Slavic.

In those days, fighting with code pages was a constant source of annoyance and pain. DOS and OS/2 used CP852, Windows used CP1250, and Unix/Linux used ISO 8859-2. Of course these code pages were all incompatible with each other. The worst problem was the early web, where content was often offered in some 8-bit encoding but with no hint as to which encoding that might have been (let’s play a guessing game!). It is a real shame that UTF-8 hadn’t come a bit earlier.
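To make the incompatibility concrete, here is a minimal Python sketch that encodes one Czech word in each of the three code pages and then misreads the CP852 bytes as CP1250. The sample word and the variable names are illustrative choices only; the codec names are the ones shipped with Python's standard library.

```python
# Encode one Czech word in each of the three code pages named above.
word = "žluťoučký"  # arbitrary sample text with several accented letters

encodings = ["cp852", "cp1250", "iso8859_2"]
encoded = {name: word.encode(name) for name in encodings}

# Print the raw bytes side by side; the accented letters land on
# different byte values in each code page.
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex(' ')}")

# Decoding CP852 bytes as if they were CP1250 simulates the
# "guessing game": the plain ASCII letters survive, the accents do not.
misread = encoded["cp852"].decode("cp1250", errors="replace")
print(misread)
```

Without an out-of-band hint about the encoding, a reader has no reliable way to tell which of the three byte sequences it is looking at.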

In the early to mid-1990s the situation was further complicated by several non-standard encodings, like the Kamenický brothers encoding in Czechoslovakia or the Mazovia encoding in Poland. Those encodings originated in the mid-1980s and tended to preserve most of the CP437 semi-graphic characters. Code page 852 did not, but on the other hand it covered quite a few languages. Users initially preferred the non-standard national encodings because those worked better for them, but built-in operating system support pushed those out.

And now I started wondering: When did CP852 become available to users, and where did it actually come from? The first question can be answered reasonably accurately, while the second remains unclear.

Continue reading
Posted in DOS, I18N, IBM, Microsoft, OS/2, PC history | 47 Comments

XMVM Surgery

Last week I was prompted to take a look at the Intel Code Builder compiler from 1991, a 32-bit compiler targeting 386 extended DOS and shipping with its own DOS extender. It is what one might call an extremely obscure compiler; it had to compete with established offerings from MetaWare, Watcom, or Zortech, and soon also with the compiler heavyweights Borland and Microsoft.

One of Code Builder’s very few claims to fame is that early alpha releases of id Software’s DOOM were built with Intel Code Builder 1.1, before id switched to Watcom compilers for the DOS releases of DOOM.

There is one poorly preserved archive of Code Builder 1.0 available. As others have noticed, it won’t even build the trivial hello world program that comes with it:

Linker error caused by missing XMVM

There should be a file called XMVM installed with Code Builder, but it’s just not there. Since it is required by default for linking of 32-bit executables, nothing works unless the /XNOVM switch is passed to the compiler/linker.

The OS/2 Museum happens to have an archive of the Intel 386/486 C Code Builder Kit v1.0 installer which clearly explains why the other copies have no XMVM file. It is in the installer archive… corrupted, and cannot be uncompressed:

XMVM is missing because the installer was corrupted

The compiler can be installed, but the XMVM file will be missing.

But wait! If the XMVM module is linked into executables produced by Code Builder, perhaps there is a way to recover it from Code Builder itself?

Continue reading
Posted in 386, Development, Intel, PC history, Software Hacks | 10 Comments

Another Trip to Drive Geometry Hell

Recently I took another close look at the IDE.DSK driver in NetWare 3.12. Among other things, I wanted to know how it differs from ISADISK.DSK. On some systems, the two drivers are interchangeable and either will work. But there are also systems that only one or the other driver can handle.

The ISADISK.DSK driver should really be called ATDISK.DSK because it’s written to the PC/AT fixed disk programming interface. As such, it will work with ST506 style MFM/RLL drives attached to an AT-compatible controller. It will also work with ESDI drives attached to an AT-compatible controller (such as the WD1005A or WD1007V). And it will also work with IDE drives.

However, ISADISK.DSK will only work with up to two drives supported by the system’s BIOS. ISADISK.DSK looks at the drive type information in the CMOS and relies on the FDPT (Fixed Disk Parameter Table, pointed to by interrupt vectors 41h/46h) to query geometry information. Said geometry is used internally and also fed to the INITIALIZE DRIVE PARAMETERS command.

For that reason, ISADISK.DSK will work with IDE drives as long as the FDPT geometry is compatible with IDE, i.e. does not have more than 16 heads; the BIOS limit on sectors per track, 63, is lower than IDE’s 255, so that won’t be a problem.
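The compatibility condition above boils down to a simple range check. The following is a rough sketch only: the function and constant names are hypothetical, and this is not how ISADISK.DSK itself is written. It merely restates the limits quoted in the text.

```python
IDE_MAX_HEADS = 16      # IDE limit on heads, as stated in the text
IDE_MAX_SECTORS = 255   # IDE limit on sectors per track, as stated in the text
BIOS_MAX_SECTORS = 63   # BIOS limit on sectors per track, as stated in the text


def fdpt_geometry_ok_for_ide(heads: int, sectors_per_track: int) -> bool:
    """Return True if a BIOS-reported geometry fits within the IDE limits.

    Hypothetical helper: it only encodes the two limits from the text.
    """
    heads_ok = 1 <= heads <= IDE_MAX_HEADS
    sectors_ok = 1 <= sectors_per_track <= IDE_MAX_SECTORS
    return heads_ok and sectors_ok


# A BIOS can never report more than 63 sectors per track, so in practice
# only the head count decides the outcome.
print(fdpt_geometry_ok_for_ide(16, BIOS_MAX_SECTORS))  # within both limits
print(fdpt_geometry_ok_for_ide(32, BIOS_MAX_SECTORS))  # too many heads for IDE
```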

Continue reading
Posted in IDE, NetWare, PC history | 8 Comments

Another Myth Busted

More than once I came across a story of a heroic MicroPro programmer who in an all-night session managed to port WordStar from CP/M to DOS by patching a single byte. This is how the legend was retold by Joel Spolsky:

Now, here’s a little known fact: even DOS 1.0 was designed with a CP/M backwards compatibility mode built in. Not only did it have its own spiffy new programming interface, known to hard core programmers as INT 21, but it fully supported the old CP/M programming interface. It could almost run CP/M software. In fact, WordStar was ported to DOS by changing one single byte in the code. (Real Programmers can tell you what that byte was, I’ve long since forgotten).

Joel Spolsky,

Now, that story is slightly misleading. The “spiffy new programming interface” accessible through INT 21h pretty much was the CP/M programming interface, and it wasn’t until DOS 2.0 that the INT 21h interface was significantly enhanced.

But the gist of the story does not even make sense. Although DOS was designed to make porting from CP/M easy, it was never a question of patching a byte here or there, since CP/M ran on 8080 CPUs and DOS ran on 8086/8088 processors. The processor families are certainly related, but not at all binary compatible. 8080 assembly source code could be machine translated to 8086 source and reassembled, but the code quality was reportedly less than ideal.

And yet… there is a kernel of truth in the story, even though it morphed into something highly implausible. Similarly, there really are Wang word processor symbols in the IBM PC character set, even though the stories told by Bill Gates are very difficult to take seriously.

Continue reading
Posted in CP/M, DOS, PC history, WordStar | 5 Comments

Unidentified PC DOS 1.1 Boot Sector Junk Identified

Anyone trying to disassemble the PC DOS 1.1 boot sector soon notices that at offsets 1A3h through 1BEh there is a byte sequence that just does not belong. It appears to be a fragment of code, but it has no purpose in the boot sector and is never executed. So why is the sequence of junk bytes there, and where did it come from?

The immediate answer is “it came from FORMAT.COM”. The junk is copied verbatim from FORMAT.COM to the boot sector. But those junk bytes are not part of FORMAT.COM, either. So the question merely shifts to “why are the junk bytes in FORMAT.COM, and where did they come from?”

It is not known if anyone answered the question in the past, but the answer has been found now, almost 40 years later—twice independently.

Continue reading
Posted in Development, DOS, PC history | 24 Comments

First Dual-Channel IDE?

The OS/2 Museum recently came into possession of what may be the first adapter with support for two IDE channels… sort of:

Two-cable IDE adapter, 1989

The adapter was made by Plus Development Corporation, a subsidiary of the disk maker Quantum. This particular specimen was manufactured in 1989, though its BIOS has a 1988 copyright.

Continue reading
Posted in IDE, PC hardware, PC history, Quantum | 29 Comments

LAN Manager 2.1/2.2 Registration

Anyone who spent a bit of time archiving software distributed on floppies probably knows this situation: There’s only one disk set of a given software release known to exist, and it’s not clean. That is, it’s been previously used to install the software and the installer scribbled something on the floppies, like the user’s name or a serial number.

Installing a clean copy of LAN Manager 2.2

Restoring the disks to a virgin state is desirable, but that’s easier said than done. There are easy cases like the OS/2 installation boot floppy: The installer creates a file named INSTALL.LOG and writes a couple of things to it, but does not modify any existing files. Such behavior is relatively harmless because the additional file can be simply ignored, or all traces of it can be thoroughly erased.

LAN Manager 2.1 and 2.2 are in a different category. The setup program “burns in” the user name, modifying a file on one of the installation disks. That is far more difficult to undo because an existing file’s timestamp is changed and the contents modified. Restoring original content is rather difficult because it’s not at all obvious what it had originally been. Having just one “clean” set of LAN Manager disks helps, though.

Continue reading
Posted in Archiving, Debugging, Microsoft, Software Hacks | 18 Comments

The Secret History of ATAPI

The other day I asked myself a seemingly trivial question: What was the first ATAPI CD-ROM drive and when was it available? Given that ATAPI was a major technology which instantly obsoleted all proprietary CD-ROM interfaces and made SCSI much less desirable, one might expect that there would have been some press releases touting the advantages of the new technology, articles describing the whys and wherefores, but… nope. There is nothing.

A rather old IDE CD-ROM drive with curious jumper settings

In 1993, CD-ROM drives used either SCSI or one of several proprietary interfaces, the major amongst those being Matsushita/Panasonic, Mitsumi, Philips, and Sony. In 1995, the proprietary interfaces were effectively gone and most new CD-ROM drives used the ATAPI interface. Something clearly happened in 1994, but exactly what, when, and how—that’s something of a mystery.

Continue reading
Posted in CD-ROM, PC history, Standards, Undocumented | 59 Comments

Looking for High Sierra

Some time ago, I thought it would be useful to understand exactly what the difference is between CD-ROMs recorded in the old High Sierra format and those following the ISO 9660 standard. This was in part spurred by the fact that I have a number of CD-ROMs/images that use the High Sierra format (Microsoft Programmer’s Library, some IBM Developer Connection issues, OS/2 Warp 4, and more) that both macOS and Windows 10 refuse to mount. The other part of my motivation was the usual insatiable curiosity.

Finding the actual text of the High Sierra Working Paper (also High Sierra Proposal, i.e. proposed standard) turned out to be rather unexpectedly difficult. I found a number of articles talking about the High Sierra Proposal (HSP) but not the actual HSP text.

The closest thing I could find was an article in the excellent PC Tech Journal in the July 1987 issue (Patterning CD-ROM by Peter Jansson, page 163). Said article recaps the HSP in very good detail but it’s not the actual text. But even that was enough to show that although the structure defined in the High Sierra format is not far from the ISO 9660 standard, the two data structures are just different enough to be mutually incompatible.
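One visible symptom of that incompatibility is the volume descriptor signature. The sketch below relies on details that are not in the text above but are commonly documented: the first volume descriptor sits in sector 16, ISO 9660 stores "CD001" at byte offset 1, and High Sierra stores "CDROM" at byte offset 9. It also assumes an image with plain 2048-byte sectors. The function name is hypothetical.

```python
SECTOR_SIZE = 2048            # assumes a cooked (2048 bytes/sector) image
FIRST_DESCRIPTOR_SECTOR = 16  # volume descriptors start after the system area


def identify_cd_format(image: bytes) -> str:
    """Guess the file system format from the first volume descriptor."""
    start = FIRST_DESCRIPTOR_SECTOR * SECTOR_SIZE
    descriptor = image[start:start + SECTOR_SIZE]
    # ISO 9660: type byte at offset 0, then the 5-byte standard identifier.
    if descriptor[1:6] == b"CD001":
        return "ISO 9660"
    # High Sierra: 8-byte block number, type byte, then the identifier.
    if descriptor[9:14] == b"CDROM":
        return "High Sierra"
    return "unknown"


# Synthetic demo images containing nothing but a signature.
system_area = bytes(FIRST_DESCRIPTOR_SECTOR * SECTOR_SIZE)
iso_image = system_area + b"\x01CD001\x01" + bytes(SECTOR_SIZE - 7)
hsf_image = system_area + bytes(8) + b"\x01CDROM\x01" + bytes(SECTOR_SIZE - 15)

print(identify_cd_format(iso_image))
print(identify_cd_format(hsf_image))
```

Because the identifier sits at a different offset, software that only looks for "CD001" at offset 1 will not recognize a High Sierra disc at all, which would be consistent with modern systems refusing to mount such images.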

Continue reading
Posted in CD-ROM, Documentation, PC history | 23 Comments