Is it so hard to document things?

A few weeks ago I spent a bit of time debugging a program which mysteriously failed under DOS 3.3, although it worked without any apparent problem on DOS 4.0 and later, and there was no indication that it required anything later than DOS 3.1 or so.

The problem turned out to be related to INT 21h, function 6300h, a curiously underdocumented DOS API. The call was first implemented in the Far East versions of MS-DOS 2.25 and returned a table of DBCS lead bytes. That is crucial information for software running on DBCS systems, required for correctly parsing pathnames etc. The RBIL also documents the call for Far East versions of DOS 3.2 and later. So what’s the problem?

The problem is that the call only exists in Far East versions of DOS 3.x. But that’s not the whole story. Microsoft’s official documentation (e.g. the MS-DOS Encyclopedia) documents the INT 21h, function 6300h API as being available only in MS-DOS 2.25. The call is not documented in any of the official references for MS-DOS 3.3, 4.0, 5.0, or PC DOS 7.0.

So how are applications supposed to get the information? In DOS 3.3, the internationalization support was redesigned (by IBM?) and function 65h was used to return country-related information. However, perhaps because IBM did not support Japanese or other DBCS languages at the time, there was no documented way to obtain the DBCS lead byte table. In DOS 4.0, sub-function 7 was added to INT 21h, function 65h to return the table, in almost the same format as the one previously returned by INT 21h, function 6300h.

The official MS-DOS documentation does not clearly say whether INT 21h, function 65h, sub-function 7 is supported on US and other non-DBCS versions of DOS, but it is supported and returns an empty lead byte table.
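To make the lead byte table more concrete, here is a minimal sketch in Python (not DOS code). It assumes the layout described in the RBIL for the table that function 6300h points to: a list of start/end byte pairs closed by a 00h, 00h pair, so that an empty table is nothing but the terminator. The function names and the Shift-JIS style sample ranges are illustrative only. The path splitter shows why the table matters: in Shift-JIS the second byte of a double-byte character can be 5Ch, the same value as a backslash.

```python
def parse_lead_byte_ranges(table):
    """Walk start/end pairs until the 00h,00h terminator (assumed layout)."""
    ranges = []
    for i in range(0, len(table) - 1, 2):
        start, end = table[i], table[i + 1]
        if start == 0 and end == 0:
            return ranges
        ranges.append((start, end))
    raise ValueError("lead byte table is missing its 00h,00h terminator")


def is_lead_byte(byte, ranges):
    """True if the byte starts a double-byte character."""
    return any(start <= byte <= end for start, end in ranges)


def split_path(path, ranges):
    """Split a pathname on backslashes that are not DBCS trail bytes."""
    parts, current, i = [], bytearray(), 0
    while i < len(path):
        byte = path[i]
        if is_lead_byte(byte, ranges) and i + 1 < len(path):
            # Copy both bytes of the character; the second is never a separator.
            current += path[i:i + 2]
            i += 2
        elif byte == 0x5C:
            parts.append(bytes(current))
            current = bytearray()
            i += 1
        else:
            current.append(byte)
            i += 1
    parts.append(bytes(current))
    return parts


# Illustrative tables: Shift-JIS style ranges, and the empty table
# a non-DBCS system would hand back.
SAMPLE_DBCS_TABLE = bytes([0x81, 0x9F, 0xE0, 0xFC, 0x00, 0x00])
EMPTY_TABLE = bytes([0x00, 0x00])

if __name__ == "__main__":
    path = bytes([0x41, 0x5C, 0x83, 0x5C, 0x5C, 0x42])  # A \ <83h 5Ch> \ B
    print(split_path(path, parse_lead_byte_ranges(SAMPLE_DBCS_TABLE)))
    print(split_path(path, parse_lead_byte_ranges(EMPTY_TABLE)))
```

With the sample ranges, the byte pair 83h 5Ch stays together as one character. With the empty table, the same 5Ch is mistaken for a path separator and the name is split in the wrong place.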

What the official documentation doesn’t say is that DOS 4.0 and later also support the old INT 21h, function 6300h API, in all national language versions. The upshot is that an application can call INT 21h, function 6300h on any Far East version of DOS since at least 3.2 (as well as on 2.25) to obtain the DBCS lead byte table (verified with MS-DOS/V 6.2).

It is a mystery why Microsoft/IBM didn’t document this fact. In all official DOS 3.3 and later references, INT 21h, function 63h is marked as reserved. Did Microsoft want to force programmers to utilize the newer INT 21h, function 65h API? But how were they supposed to get the information on DOS versions prior to 4.0? There seems to be no good reason not to document the API, as there’s nothing secret or potentially subversive about it.

It would seem that an application might then simply call INT 21h, function 6300h to obtain the DBCS lead byte table and just be done with it, documented or not. The catch is that on US versions of DOS 3.x, the INT 21h, function 6300h API simply does nothing. It does not indicate an error, it just returns without modifying any registers or flags.

So how to get the DBCS lead byte table without complicated DOS version checks and guesswork? It’s actually easy. The DS:SI registers should be set to 0:0 before calling the API. If a valid DBCS lead byte table is returned, the DS:SI registers will be modified (the table can’t possibly be stored at 0:0). If DS:SI are still 0:0, the API is not implemented, the DOS is not a Far East version, and hence there are no double-byte characters and no lead byte table.
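The sentinel trick can be sketched as follows. Real code would load the registers and issue INT 21h from assembly or a DOS C compiler; the Python below only models the logic. The `Regs` class, the two stand-in handlers, and the segment:offset values are all hypothetical and exist purely for illustration: one handler behaves like a US DOS 3.x (it leaves every register untouched), the other like a Far East DOS (it points DS:SI at its table).

```python
from dataclasses import dataclass


@dataclass
class Regs:
    """Just the registers that matter for this call (hypothetical model)."""
    ax: int = 0
    ds: int = 0
    si: int = 0


def int21_us_dos3(regs):
    """Stand-in for US DOS 3.x: function 6300h returns without changes."""
    return None


def make_far_east_int21(table_seg, table_off):
    """Stand-in for a Far East DOS whose table lives at table_seg:table_off."""
    def int21(regs):
        if regs.ax == 0x6300:
            regs.ds, regs.si = table_seg, table_off
    return int21


def find_dbcs_table(int21):
    """Return (segment, offset) of the lead byte table, or None if absent."""
    # Preload DS:SI with 0:0. A real table cannot live there, so an
    # unchanged 0:0 afterwards means the call is not implemented.
    regs = Regs(ax=0x6300, ds=0, si=0)
    int21(regs)
    if (regs.ds, regs.si) == (0, 0):
        return None
    return (regs.ds, regs.si)


if __name__ == "__main__":
    print(find_dbcs_table(int21_us_dos3))
    print(find_dbcs_table(make_far_east_int21(0x0070, 0x1234)))
```

The caller needs no DOS version check at all. A `None` result means no double-byte characters; any other result is a pointer to a table, which may still be empty on non-DBCS versions of DOS 4.0 and later.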

Note that INT 21h, function 6300h is also implemented in the OS/2 2.0 DOS box and in NTVDM. It is likewise implemented in DR DOS, at least versions 6.0 and later. Interestingly, the API is not documented in the DR DOS technical reference either, but the DEBUG.EXE utility in Novell DOS 7 displays an appropriate description (“DOS: Two byte chars”) when the INT 21h service is about to be executed.

Update: Last week I obtained a copy of Developing Applications Using DOS by Christopher, Feigenbaum, and Saliga. While it is not an official DOS reference book, it was written by the engineers who led the development of DOS 4.0 at IBM and it is clearly an unusually well-informed book. INT 21h, function 6300h is documented in detail and marked as a DBCS-only function introduced in DOS 3.2. The function is labeled as “published”, that is, supported by future DOS versions. There’s a general note that DBCS functions return an invalid function error on non-Asian DOS versions, but no specific remarks for INT 21h, function 6300h with regard to DOS versions.


66 Responses to Is it so hard to document things?

  1. techfury90 says:

    How much more RAM did it use? That might have been an issue for DBCS users. I can’t vouch for Korean DOS, but PC-98 DOS is absolutely huge. The 640k-1MB range is less fun too. Not much room for contiguous UMBs because of option ROMs and graphics VRAM bitplane 3.

  2. Michal Necasek says:

    As far as I can tell, the difference between 2.11 and 2.25 is that the latter supports INT 21h/63h, and also has better (proper?) support for DBCS characters in the console drivers. There is a concept of interim characters and DOS handles them.

    Hmm, the MS-DOS 3.21 OAG (OEM Adaptation Guide) suggests that 2.25 also supported “bigfat” (16-bit FAT) disks, which 2.11 did not support. I have not seen this mentioned elsewhere (as a DOS 2.25 feature), but this should be reliable information.

  3. dosfan says:

    Where does the MS-DOS 3.21 OAG suggest that MS-DOS 2.25 supported bigfat ? I assume you’re referring to the file 050187AG.DOC. I see mentions of function 63h and interim character support for the console driver but nothing else.

    At this point one has to wonder if any OEMs actually distributed MS-DOS 2.25. Perhaps Microsoft was asked to do the work, did it but the requesting OEM backed out and Microsoft documented it anyway in hopes that some other Far East OEM would be enticed to use it.

  4. Michal Necasek says:

    Yes, that document. Chapter 7B, Writing the FORMAT Module for MS-DOS 2.25/3.10.

    Yes, it is possible MSFT did the work and it ended up not being distributed. It’s also possible that it was distributed but really only in Japan/Korea (that’s a given I think) and in numbers that made it completely irrelevant. Without some significant Japanese/Korean expertise it’s hard to say, but the lack of any real product mention is curious.

  5. dosfan says:

    That document says this:
    HIGHLIGHTS OF CHANGES IN 3.XX OVER PREVIOUS 2.XX FORMAT VERSION:

    o FORMAT is now designed to be an .EXE file.

    o The FBIGFAT variable has been introduced for 16-bit FAT support.

    o ALLOCATEFAT routine allows space for the FAT to be dynamically allocated.

    That sounds like they’re saying these were features new to DOS 3.x which is known to be the case.

    You’re probably right, if anyone in Japan or Korea actually got MS-DOS 2.25 it was likely little used. At this point the only copy of it is probably in Microsoft’s archive which even Paul Allen can’t get access to for his computer museum.

  6. techfury90 says:

    Didn’t need to buy that ASCII MS-DOS Programmer’s Handbook. Found a PDF on archive.org. It doesn’t mention anything above AH=57h.

  7. techfury90 says:

    We might have an interesting possible candidate: the MS-DOS 2.11 package offered by Epson for their PC-98 clones. http://island.geocities.jp/cklouch/column/pc98bas/epsonpcdos.htm says that it shipped June 1988, and has a particularly interesting note that it supports “extended format” (that’s PC-98 for “HDD with partition table instead of superfloppy”) HDDs of 40 MB or less.

    It would also appear that a second revision from 1989 exists: http://j02.nobody.jp/computer/epson/msdos211/t.html (screenshots and pictures of the package)

    Epson’s PC-98 clones didn’t come out until 1988, so it would make sense that this “2.11” could potentially be 2.25 in disguise, assuming the theory is true.

  8. Michal Necasek says:

    If MS-DOS 2.25 reported itself as 2.11 (which does not sound crazy to me), then it could certainly fly under the radar that way. MS-DOS 2.11 did support hard disks, at least theoretically with up to 32MB partitions, but FAT12 only — so huge clusters. In 1988 they really ought to have been using something better.

    So… any surviving disk images?

  9. techfury90 says:

    Yeah, Epson shipped a “3.1” a few months later. Then they came out with Epson DOS 4, which is curious because there was no official NEC 4.0 release (NEC skipped from 3.3, actually a fork of 3.21, to 5.0)

    Regarding images: no luck finding them. I’m going to be keeping an eye out at the usual sources for a physical copy. It seems like the kind of thing that shouldn’t be too difficult to find with patience. Japan does seem to do better at being able to somehow acquire a physical copy, at least.

  10. MiaM says:

    Maybe I’m showing my lack of knowledge about East Asian characters, but what about Chinese? At the time mainland China seems unlikely as a market, but what about Taiwan?

    Btw, BBSes were also a thing in Europe. But online services were, afaik, only a thing in France, as their monopoly telco at the time understood what an online service really should contain. They offered customers the option to skip the printed catalogue and instead somehow hook up to their Minitel service, and it contained stuff that was actually useful for people other than those in the finance / corporate leadership sector. This is, btw, afaik also the reason why the internet caught on rather slowly in France: the only thing dial-up internet offered was better graphics with slower loading speeds. All the stuff we take for granted today, like buying tickets online etc., was already there in their Minitel system. (This is based on what I’ve read in various places. Can’t verify all details due to not understanding French).

    Just looking at some old lists of how Fidonet was structured at various stages says something about how big or small a thing BBSes were at the time. But it doesn’t say anything about the ratio between so-called “serious” BBSes and “warez” BBSes. (They were almost never the same, as you were required to use real names on Fidonet and no one used or wanted to use real names in the “scene”. In some rare cases two different BBSes ran on the same computer).

  11. MiaM says:

    Btw there seems to be a bug on this web site. When I post a comment I get redirected to the first comment page even if the comment is on the second page. The same bug manifests itself in the links from the RSS feed of comments ( http://www.os2museum.com/wp/comments/feed/ ).

  12. Michal Necasek says:

    Sounds like a WordPress thing, not something I’m keen on fixing, sorry…

  13. Michal Necasek says:

    Mainland China was I think completely ignored in the 1980s. They used generic PCs with homebrew Chinese add-ons. Taiwan/HK probably wasn’t a big enough market and piracy was rampant. Japan and Korea were completely different (and separate) cases. I know there were Chinese versions of PC DOS 6/7 but not sure about anything older. Now I’m not sure if MS-DOS even had any Chinese variant or if it was just Japanese and Korean.

    My understanding of Minitel is about the same. Way ahead of its time, delayed the onset of Internet in France because it did much of the same things.

  14. Yuhong Bao says:

    I think there was a PRC version of MS-DOS 6.22 at least, and PRC Win9x still shipped with this stuff.

  15. Michal Necasek says:

    Looks like there was, and didn’t make much of a splash. I’m told most users ran English MS-DOS with various homebrew add-ons, and those were better/more popular than Microsoft’s eventual Simplified Chinese MS-DOS 6.22 (and 6.21?). IBM had Traditional Chinese PC DOS 6.1 and 7, and also Simplified Chinese PC DOS 7.
