The Danger of Knowing Too Much

A few days ago I had to look a little closer at Microsoft’s KEYB.COM because it was misbehaving in a virtualized environment. As a reminder for those readers who perhaps forgot, KEYB.COM was the DOS keyboard “driver” with support for international keyboard layouts. The MS-DOS version of KEYB.COM was incorrectly detecting the keyboard type (it thought an 84-key keyboard was attached when it was, of course, a 101/102-key one), but IBM’s KEYB.COM from PC DOS 7 had no problem, and the FreeDOS version likewise worked fine. What I found in KEYB.COM was a little surprising.

The authors of KEYB.COM clearly had a lot of inside knowledge about PC hardware. It’s quite likely that IBM had a hand in writing the utility, but Microsoft probably maintained it and made sure it worked (more or less) on non-IBM systems.

KEYB.COM first tries to detect the machine type, primarily by looking at the model byte near the end of the BIOS ROM. The original PC and the PC/XT are handled specially, because those used a different keyboard type. Some PS/2 models are also detected, especially the non-Micro Channel kind.
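
To make the model byte check concrete, here is a minimal Python sketch that reads the byte from a 64 KB image of segment F000h and looks it up. The table holds the commonly documented IBM values; the function names and the `XT_CLASS` grouping are my own illustration, not anything taken from KEYB.COM.

```python
# Commonly documented IBM model bytes, stored at F000:FFFE.
MODEL_BYTES = {
    0xFF: "IBM PC",
    0xFE: "IBM PC/XT",
    0xFD: "IBM PCjr",
    0xFC: "IBM PC/AT class (shared with several PS/2 models)",
    0xFB: "IBM PC/XT (later BIOS)",
    0xFA: "IBM PS/2 Model 30 (non-Micro Channel)",
    0xF9: "IBM PC Convertible",
    0xF8: "IBM PS/2 Model 80",
}

# Hypothetical grouping: machines the article says get special handling
# because they used a different keyboard type.
XT_CLASS = {0xFF, 0xFE}


def model_byte(rom_f000: bytes) -> int:
    """Return the model byte from a 64 KB image of segment F000h."""
    if len(rom_f000) != 0x10000:
        raise ValueError("expected a 64 KB image of segment F000h")
    return rom_f000[0xFFFE]


def describe_model(model: int) -> str:
    """Map a model byte to a human-readable name."""
    return MODEL_BYTES.get(model, "unknown (model byte %02Xh)" % model)


def needs_pc_xt_handling(model: int) -> bool:
    """True for the original PC and the PC/XT."""
    return model in XT_CLASS
```

A clone BIOS is free to put whatever it likes in that byte, which is one reason the detection can only ever be a heuristic.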

When KEYB.COM determines that the system looks like a PC/AT or a “true” PS/2, it attempts to read the ID of the keyboard (command F2h is sent to the keyboard). If nothing comes back, the original 84-key AT keyboard is assumed. If the keyboard does respond, there are several options. The standard layout (ID 83ABh) is detected, as is the compact layout (84ABh). Several Japanese keyboard layouts are also detected. Provisions are made for systems that have the scan code translation enabled in the KBC (which applies to the keyboard ID as well!) or disabled.
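
The decision tree above can be sketched roughly as follows. The two untranslated ID words are the ones quoted in the text; the translated variants (41ABh and 54ABh) are my assumption about what the KBC's scan code translation does to the second ID byte, and the Japanese layouts are left out. The function name and return shape are invented for this illustration.

```python
# ID words as written in the text: first byte received goes in the low byte.
RAW_IDS = {
    0x83AB: "standard 101/102-key layout",
    0x84AB: "compact layout",
}

# Assumption: the same IDs as seen when the KBC translates scan codes
# (83h becomes 41h, 84h becomes 54h; ABh passes through unchanged).
TRANSLATED_IDS = {
    0x41AB: "standard 101/102-key layout",
    0x54AB: "compact layout",
}


def classify_keyboard_id(id_bytes: bytes):
    """Classify the reply to keyboard command F2h.

    Returns (description, translated), where translated is True or False
    for recognized IDs and None when that cannot be determined.
    """
    if len(id_bytes) == 0:
        # No answer at all: assume the original 84-key AT keyboard.
        return ("84-key AT keyboard", None)
    if len(id_bytes) < 2:
        return ("unknown", None)
    word = id_bytes[0] | (id_bytes[1] << 8)
    if word in RAW_IDS:
        return (RAW_IDS[word], False)
    if word in TRANSLATED_IDS:
        return (TRANSLATED_IDS[word], True)
    return ("unknown", None)
```

Note how fragile the first branch is: any environment where the ID reply is lost or delayed ends up looking like an 84-key keyboard.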

All that is not so unusual. Likewise not too unusual is the fact that KEYB.COM installs a custom keyboard interrupt (IRQ 1) handler. What’s certainly unusual is that this keyboard handler is suspiciously similar to IBM’s ROM BIOS handler. Not identical, but very clearly derived from the same source code. Interestingly, KEYB.COM does not replace the other half of the BIOS keyboard service, INT 16h. The system’s INT 16h implementation had better be compatible with the IBM-style keyboard interrupt handler.

It gets more interesting. Rather than using the BIOS to implement short delays, KEYB.COM reads the refresh bit from system port B (at port 61h), expecting it to change at a constant rate. That seems entirely unnecessary (since a PC/AT class system is required for the refresh bit polling anyway) and rather questionable. The mechanism was documented, but programmers hardly had any right to expect that it would exist in all future models. There’s a reason why the IBM programming documentation warns programmers to use BIOS interfaces where possible rather than using the hardware directly.
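
For illustration, here is a small Python simulation of this kind of delay loop. It is not KEYB.COM's code: the port is replaced by a fake object, and the 15.085 µs toggle interval is an assumption based on the typical PC/AT refresh timing, which is exactly the kind of constant such code silently depends on.

```python
import math

REFRESH_BIT = 0x10          # bit 4 of system port B (port 61h)
REFRESH_TOGGLE_US = 15.085  # assumption: typical PC/AT refresh interval


def toggles_for_delay(microseconds: float) -> int:
    """Number of refresh-bit toggles to wait for a delay of at least this long."""
    return math.ceil(microseconds / REFRESH_TOGGLE_US)


def wait_toggles(read_port_61h, count: int) -> int:
    """Poll until the refresh bit has changed `count` times.

    Returns the number of polls, so the simulation can be inspected.
    """
    last = read_port_61h() & REFRESH_BIT
    polls = 0
    while count > 0:
        current = read_port_61h() & REFRESH_BIT
        polls += 1
        if current != last:
            last = current
            count -= 1
    return polls


class FakePortB:
    """Stand-in for port 61h whose refresh bit flips every `period` reads."""

    def __init__(self, period: int):
        self.period = period
        self.reads = 0

    def __call__(self) -> int:
        value = REFRESH_BIT if (self.reads // self.period) % 2 else 0x00
        self.reads += 1
        return value
```

If the bit never toggles, `wait_toggles` never returns, and if it toggles at a different rate, every delay is silently wrong. Neither failure is visible to the caller.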

Unfortunately, it only goes downhill from there. Once KEYB.COM has obtained a keyboard ID, it determines whether an EBDA (Extended BIOS Data Area) segment is present (INT 15h, function C1h). If it is, KEYB.COM proceeds to mercilessly write the ID word to offset 3Bh in the EBDA. IBM never documented the exact contents of the EBDA and explicitly warned that using “magic” fixed offsets is bad. To be sure, KEYB.COM doesn’t quite follow the documented procedure either; it calls INT 15h, function C1h to get the EBDA segment without first checking with INT 15h, function C0h (Get System Configuration) whether the EBDA is supported or not.
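
A sketch of what the more careful sequence looks like, modeled in Python rather than real-mode assembly. I am assuming here that bit 2 of feature byte 1 in the INT 15h/C0h configuration table is the “EBDA allocated” flag; `get_ebda_segment` is a hypothetical callback standing in for INT 15h, function C1h.

```python
EBDA_ALLOCATED = 0x04   # assumption: bit 2 of feature byte 1 (INT 15h/C0h)
KEYB_ID_OFFSET = 0x3B   # the fixed EBDA offset KEYB.COM writes to


def ebda_segment_if_supported(feature_byte_1: int, get_ebda_segment):
    """Ask for the EBDA segment only if the configuration table says one exists."""
    if not feature_byte_1 & EBDA_ALLOCATED:
        return None
    return get_ebda_segment()


def linear_address(segment: int, offset: int) -> int:
    """Real-mode segment:offset to linear address."""
    return ((segment << 4) + offset) & 0xFFFFF


def keyb_id_address(segment: int) -> int:
    """Linear address of the word KEYB.COM overwrites in the EBDA."""
    return linear_address(segment, KEYB_ID_OFFSET)
```

Even with the check in place, the write itself still depends on an undocumented layout; the check only avoids asking for an EBDA that isn’t there.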

The crowning achievement of KEYB.COM is the way it detects support for extended INT 16h functions. A normal programmer would try calling the functions; for example, functions 12h and 22h could easily be called to check whether they’re implemented or not. The author of KEYB.COM instead decided to use a truly ingenious method: try executing INT 16h with function 92h and check the contents of register AH in order to determine whether functions 10h-12h are supported.

Now, INT 16h function 92h was never documented, and doesn’t exist in any BIOS I’ve ever seen, although in theory it might. For that reason, no one should call it because there’s no telling what it could do. Alas, the KEYB.COM author read the IBM BIOS listings and noticed that INT 16h does not contain a jump table, but rather decrements AH several times and jumps to a specific sub-function if AH reaches zero. Thus an old BIOS with no extended function support would return with (most likely) 90h in AH when called with 92h in AH. An extended BIOS on the other hand would return with 80h or less in AH. This has been noticed and documented.
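
The side effect is easier to see in a toy model. The Python sketch below imitates a dispatcher that subtracts its way down to each supported function number and falls through with whatever is left in AH. The function lists and the 80h threshold are assumptions of this model; the exact leftover value depends on how a particular BIOS orders its tests, so the numbers it produces are illustrative only.

```python
def leftover_ah(ah: int, functions):
    """Model an INT 16h dispatcher that has no jump table.

    `functions` lists the supported function numbers in ascending order.
    AH is reduced step by step to test for each one; if it reaches zero
    the call is dispatched (None is returned), otherwise the residue in
    AH is what an unsupported call hands back.
    """
    ah &= 0xFF
    consumed = 0
    for fn in functions:
        ah = (ah - (fn - consumed)) & 0xFF  # 8-bit arithmetic, as in AH
        consumed = fn
        if ah == 0:
            return None
    return ah


# Hypothetical BIOS flavors for this model.
LEGACY = [0x00, 0x01, 0x02]
EXTENDED = LEGACY + [0x10, 0x11, 0x12]
KEY122 = EXTENDED + [0x20, 0x21, 0x22]

THRESHOLD = 0x80  # assumption of this model, not a documented constant


def probe(functions, bogus_function: int) -> bool:
    """Imitate the trick: call an undefined function and inspect the residue."""
    residue = leftover_ah(bogus_function, functions)
    return residue is not None and residue <= THRESHOLD
```

In this model `probe(EXTENDED, 0x92)` comes out true and `probe(LEGACY, 0x92)` false, but only because of how the dispatcher happens to be written. A BIOS using a jump table, or one that cleans up AH differently, gives the probe nothing meaningful to look at.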

KEYB.COM uses the same trick to detect support for 122-key keyboard functions by calling INT 16h with A2h in AH. Again, the resulting AH value is used to determine whether the BIOS supports those functions or not.

The extended INT 16h detection method used by KEYB.COM manages to be amazingly clever and incredibly stupid at the same time. It relies on an undocumented side effect of calling an undefined function and effectively prevents such a function from ever being defined. There is simply no justification for writing such code, especially when newer BIOSes support INT 16h function 09h which returns the sought-for information in a documented way.
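
For comparison, decoding the documented answer is trivial. The sketch below assumes the usual description of function 09h, where AL comes back as a capability bitmap with bit 5 meaning functions 10h-12h are supported and bit 6 meaning functions 20h-22h are supported, and where bit 6 of feature byte 2 in the INT 15h/C0h table says whether function 09h itself exists. Treat those bit positions as assumptions to be checked against a BIOS reference; the function names are mine.

```python
FN09_AVAILABLE = 0x40      # assumption: bit 6 of feature byte 2 (INT 15h/C0h)
CAP_ENHANCED = 0x20        # assumption: bit 5 of AL, functions 10h-12h
CAP_122_KEY = 0x40         # assumption: bit 6 of AL, functions 20h-22h


def function_09h_available(feature_byte_2: int) -> bool:
    """True if the configuration table advertises INT 16h function 09h."""
    return bool(feature_byte_2 & FN09_AVAILABLE)


def keyboard_capabilities(al: int) -> dict:
    """Decode the AL bitmap returned by INT 16h function 09h."""
    return {
        "enhanced_10h_12h": bool(al & CAP_ENHANCED),
        "key122_20h_22h": bool(al & CAP_122_KEY),
    }
```

A cautious program would use this path when it is available and fall back to conservative assumptions when it is not, rather than poking at undefined function numbers.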

KEYB.COM is a nice example of poorly written DOS software (in this case, DOS itself) which exploits poorly documented or entirely undocumented aspects of operation of PC compatibles, blithely ignoring much of the advice given to programmers in official documentation. Such software forced system designers to implement a layer of “legacy” cruft which could not be changed and only existed to appease such software, making the PC platform a nightmare to work with.
