Well Hello

So after some furious disassembling, assembling, and linking, things got this far:

[Image: Freshly built GW-BASIC]

It took longer than it ought to have because although IDA is great, I couldn’t figure out how to make it work with GW-BASIC’s bizarre segment usage. The problem is that the BASIC data segment is tacked onto the end of the code, and during initialization, CS equals DS and variables have high offsets in the segment. But after initialization, CS no longer equals DS and DS points at the beginning of the data, which means variables are at low offsets within DS. I failed to convince IDA that the same data is accessed through completely different offsets at different points in the code. After a lot of trying and failing, I’m still unsure if I’m doing something wrong or if it’s really a situation that IDA can’t adequately handle.
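The aliasing described above can be sketched with plain real-mode address arithmetic. This is only an illustration: the segment value, the offset of the data area, and the variable offset are made-up numbers, not values taken from BASICA.EXE.

```python
def linear(segment: int, offset: int) -> int:
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical values, chosen only for illustration.
CODE_SEG = 0x1000      # CS, and also DS while initialization runs
DATA_START = 0xC3A0    # assumed paragraph-aligned start of the data within CS
VAR_IN_DATA = 0x0012   # assumed offset of one variable inside the data area

# During initialization DS == CS, so the variable has a high offset.
init_ds = CODE_SEG
init_offset = DATA_START + VAR_IN_DATA

# Afterwards DS points at the start of the data, so the offset is low.
run_ds = CODE_SEG + (DATA_START >> 4)
run_offset = VAR_IN_DATA

# Both segment:offset pairs name the same physical byte.
same_byte = linear(init_ds, init_offset) == linear(run_ds, run_offset)
print(hex(init_offset), hex(run_offset), same_byte)
```

A disassembler that tracks one offset per symbol has to be told that `init_offset` and `run_offset` refer to the same variable, which is the part that proved difficult here.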

At any rate, I ended up with a bit over four thousand lines of a GW-BASIC OEM module, lovingly borrowed from Compaq’s BASICA.EXE version 1.13.

The original most likely had the OEM code split into several modules, which is evidenced by small gaps in the code. Those would be caused by paragraph alignment of the segments.
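As a rough sketch of why paragraph alignment leaves small gaps: each module's segment contribution starts on a 16-byte boundary, so up to 15 bytes of padding can appear between one module's end and the next module's start. The module sizes below are hypothetical and only there to make the arithmetic visible.

```python
PARAGRAPH = 16  # bytes; a paragraph-aligned segment starts on a multiple of 16

def align_up(address: int, alignment: int = PARAGRAPH) -> int:
    """Round address up to the next multiple of alignment."""
    return (address + alignment - 1) // alignment * alignment

def layout(module_sizes):
    """Place modules in order; return (start, end, gap_before) per module."""
    placed = []
    cursor = 0
    for size in module_sizes:
        start = align_up(cursor)
        placed.append((start, start + size, start - cursor))
        cursor = start + size
    return placed

# Hypothetical module sizes, not measured from the Compaq OEM code.
sizes = [0x123, 0x45, 0x200]
result = layout(sizes)
gaps = [gap for _, _, gap in result]
print(gaps)
```

Working backwards, a gap of fewer than 16 bytes in otherwise contiguous code hints at a module boundary, though it cannot prove one.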

Using the previously disclosed information, the source modules were all assembled with MASM 1.06 (no errors) and linked with LINK V2.01 (Large), executable dated 4-01-83. The order of the modules definitely matters, at least to some extent. The linker warns that there is no stack segment (true; the original Compaq executable doesn’t have one either) and shows one error about an offset width exceeding field width in the MATH module. While that error is also legit, the code is just written that way. Despite those issues, the linker still produces a working executable. Here’s the map file in case anyone is curious.

The linker error turned out to be somewhat significant. The offending code in MATH1.ASM is this:

        MOV     AL,LOW OFFSET $FAC

The linker correctly complains that it can’t shove the 16-bit $FAC offset into an 8-bit register. What the authors probably actually intended is this:

        MOV     AL,BYTE PTR $FAC
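To make the difference between the two instructions concrete, here is a small model, reading both by their usual MASM meaning: the first loads an immediate (the low byte of the address of `$FAC`), the second loads the byte stored at `$FAC`. The offset and the stored byte are invented for illustration.

```python
# Hypothetical 16-bit offset of $FAC and a toy 64 KB data segment.
FAC_OFFSET = 0x04B6
memory = bytearray(0x10000)
memory[FAC_OFFSET] = 0x81  # pretend this is the byte stored at $FAC

# MOV AL,LOW OFFSET $FAC: an immediate, the low byte of the address itself.
al_immediate = FAC_OFFSET & 0xFF

# MOV AL,BYTE PTR $FAC: a memory load of the byte stored at that address.
al_memory = memory[FAC_OFFSET]

# The full offset needs 16 bits, so it cannot fit an 8-bit field unchanged.
fits_in_byte = FAC_OFFSET <= 0xFF

print(hex(al_immediate), hex(al_memory), fits_in_byte)
```

Under these assumptions the two forms put unrelated values in AL, so the choice between them is not merely cosmetic.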

Now, where it gets really interesting is that out of the four GW-BASIC 1.x executables examined (Compaq BASICA.EXE 1.13 and 1.14, Eagle GWBASIC 1.10, Corona GWBASIC 1.12.03), three in fact use the latter, corrected version of the code. Only Compaq’s BASIC 1.13, clearly the oldest of the bunch, uses the buggy code sequence which matches the source code released by Microsoft.

As previously mentioned, not one of those four binaries is an exact match for the released source. Compaq’s version 1.13 is the closest, but not identical. At this point it is anyone’s guess if any OEM ever released a GW-BASIC version that was built from the exact source code that Microsoft recently published.

Note: The OEM module is not yet in a shape suitable for publishing so don’t ask.


5 Responses to Well Hello

  1. Rich Shealer says:

    It sounds like you did a lot of work to get this far.

    How were you able to isolate the sections that made up the OEM code? Or is it by nature a solid section of the executable?

  2. Michal Necasek says:

    I started by matching the published source to the executable. Once I matched all the source files, only the OEM code was left. It just so happens that in all four executables I looked at, the OEM code is in a single contiguous section. I don’t think it has to be done that way, but it’s a natural way to do it. I only really looked at the Compaq OEM code in depth and I’m convinced they had at least three source modules, maybe more. But the only way to be reasonably sure is if there happens to be paragraph alignment that is not otherwise achievable with old MASM.

  3. buricco says:

    I’m kinda curious how much work it’d be to reconstruct 2.01 or 2.02 from this, even though as soon as I did “files” with tkchia’s reconstruction I knew I was dealing with a 1.x BASIC.

  4. Michal Necasek says:

    Hard to say. I don’t have a good sense of how much changed between GW-BASIC 1.x and 2.x. I expect it would be a sizable amount of work. It might be easier to ask Microsoft to look for the source 🙂

  5. Richard Wells says:

    MS may not be able to provide the source. Later revisions of GW-BASIC have the copyright notice that portions are from Compaq and thus releasing anything would require having HP approval or spending a lot of effort ensuring the released source is clean. MS frequently accepted code from OEMs in exchange for reducing the royalty payments. A side effect of one of those deals got Greg Whitten hired by MS.

    GW-BASIC 2 was about as different from GW-BASIC 1 as extended BASIC differed from advanced. In addition to supporting DOS 2 features like directories and redirection, there is also support for better than CGA video modes and many minor changes. The Corona manual has a section listing the more significant ones though I have not found the README.DOC that is supposed to cover all the little fixes that duplicating GW 2 would require.
