As part of a hobby project, I set out to reconstruct assembly source code that, when built with an old version of MASM, exactly matches an existing old binary. In the process I learned how old MASM versions worked, and why programmers hated MASM. Note that “old versions” in this context means MASM 5.x and older, i.e. older than MASM 6.0.
The way old MASM works is relatively straightforward but its documentation often explains it very poorly or not at all. MASM is a two-pass assembler, and that indirectly explains almost everything about its quirks. This is different from more modern N-pass assemblers which automatically run multiple passes to resolve ambiguities.
The core of the problem is that MASM tries to be clever, but it’s not nearly clever enough. It is very questionable whether MASM’s cleverness is a solution or a problem; other assemblers are stricter, relying on programmers to resolve ambiguities. This perhaps puts slightly more of a burden on the programmer but results in more readable, consistent source code.
Most ambiguities result from the fact that like most assemblers, MASM does not require symbols to be declared before they’re referenced. In the first pass, MASM generates “provisional” code, making guesses about what unknown symbols are. At the end of the first pass, all symbols are known (if they’re not, the assembly will fail).
In the second pass, MASM applies what it learned in the first pass and generates the final object code. If the guesses made in the first pass turn out to be incompatible with the second pass, MASM will report the dreaded “phase error”. More about that later.
The crucial thing to understand is that in the first pass, MASM generates enough object code to resolve all offsets, i.e. at the end of the first pass, MASM will know for each symbol defined in the source code at which offset it will be located, because it will have determined how big all generated code and data is.
Now comes the “cleverness”. For example if MASM sees a JMP to an unknown label, it will assume a 16-bit near jump, i.e. a three-byte instruction. In the second pass, MASM may find out that the jump target is within the -128/+127 byte range and generate a short jump, a two-byte instruction. Crucially, the third byte will be replaced by a NOP so that the instruction still effectively takes up three bytes.
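As a sketch of what that looks like at the byte level (label name and offsets hypothetical, encodings from the 8086 opcode map):

```asm
; TARGET is a forward reference, so in pass 1 MASM reserves
; three bytes for a near jump.
        JMP     TARGET          ; pass 1 guess: E9 xx xx (near, 3 bytes)
        ; ... a few bytes of code ...
TARGET:

; If pass 2 finds TARGET within short range, MASM emits a
; two-byte short jump and pads with NOP to preserve the size:
;
;       EB xx 90                ; JMP SHORT TARGET, then NOP
;
; Writing 'JMP SHORT TARGET' up front produces just EB xx,
; at the cost of an error if TARGET is out of short-jump range.
```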
MASM might also find out that the label is in a different segment, requiring a far jump. In that case, the jump instruction will not fit within three bytes and a phase error will result.
The programmer also has the option of writing ‘JMP SHORT xxx’ rather than ‘JMP xxx’. In that case, MASM will always generate a two-byte short jump, and possibly fail with an error if the target is not within the range of a short jump.
This is where those ‘NOP after JMP’ instructions come from. It is MASM (or perhaps some other assembler/compiler) turning a near jump into a short jump but not truly reducing the instruction size.
If the jump target is in another segment, the programmer may also write ‘JMP FAR PTR xxx’, telling MASM to generate a far jump and avoiding a phase error if the target label is in another segment but not yet known in the first pass.
Interestingly, there is at least one situation where the NOPs can be useful, especially because there does not appear to be any way (short of manually emitting opcodes) of telling MASM to generate a near jump when a short jump is possible. The BIOS component of DOS 1.x uses a CP/M inspired jump table where the “exported” interface is accessed by calling into some known base address plus an offset which is the function number times three (that being the JMP instruction size). The dispatch table looks like this:
DISPATCH:
        JMP     FUNC0
        JMP     FUNC1
        JMP     FUNC2
This would be conceptually invoked as ‘CALL FAR PTR DISPATCH+(FUNC*3)’ because the dispatch table is assumed to consist of a sequence of near jumps. If MASM turns one or more of those jumps into short jumps but pads them with a NOP, the dispatch table will still work. If an assembler ends up producing only 2-byte jumps without padding, the dispatch table will go up in flames.
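Under those assumptions, a caller in another segment might reach function N along these lines (names and the EQU are hypothetical):

```asm
; Each dispatch slot is assumed to be 3 bytes, so entry N
; lives at DISPATCH + N*3.
FUNC    EQU     2                       ; hypothetical function number

        CALL    FAR PTR DISPATCH+(FUNC*3)

; If the assembler shrank any earlier slot to a 2-byte short
; jump without NOP padding, every later slot would shift and
; the computed entry addresses would all be wrong.
```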
There are other situations where NOPs can be generated. For example ‘MOV DATA, 5’ will be byte or word sized, depending on the type of ‘DATA’. If ‘DATA’ has not yet been seen in pass 1, MASM will generate a 6-byte MOV instruction, big enough for a word-sized move. In pass 2, MASM may know that ‘DATA’ is a byte variable; in that case, the instruction will be reduced to 5 bytes, but again followed by a NOP.
This situation is exactly what ‘BYTE PTR’ can be used for. When ‘DATA’ ends up being a variable with a known size (byte or word), MASM will set the MOV instruction size based on that and not complain. The programmer can write ‘MOV BYTE PTR DATA, 5’ to prevent MASM from guessing the instruction size, or to override what MASM would do.
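A hypothetical fragment showing both behaviors (variable name and encodings illustrative; the exact opcodes are the standard 8086 C7/C6 forms):

```asm
; At this point pass 1 has not yet seen DATA, so it reserves
; 6 bytes, enough for a word-sized move.
        MOV     DATA, 5             ; pass 1 guess: C7 06 xxxx 0005
        MOV     BYTE PTR DATA, 5    ; explicit size: C6 06 xxxx 05, no guess

DATA    DB      0                   ; byte variable, defined later

; In pass 2 the first MOV shrinks to the 5-byte byte form
; C6 06 xxxx 05, padded with a NOP to stay 6 bytes.
```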
There are other situations where MASM can be unpleasantly clever. Remember those ASSUME directives? They are quite important.
Consider a situation where everything (code and data) is in a single segment named CODE, and the source file contains an ‘ASSUME CS:CODE’ directive but not more. If you write ‘MOV BYTE PTR VAR,1’, you may get a phase error depending on whether ‘VAR’ has been seen or not. Why is that?
MASM is clever and if it knows that VAR is in the code segment, it will automatically generate a CS segment override. But if it has not yet seen ‘VAR’ in the first phase, it won’t leave room for the prefix, and in the second phase it’ll report a phase error when it figures out that a segment prefix is needed but there’s no room for it.
Again, an explicitly coded segment prefix (e.g. ‘MOV BYTE PTR CS:VAR,1’) avoids this situation. Programmers need to keep this cleverness in mind because if they forget to say ‘ASSUME DS:CODE’ (assuming the DS segment register does in fact point to the CODE segment containing the data items), MASM will helpfully generate unnecessary CS segment overrides.
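A minimal single-segment sketch of the trap, assuming the hypothetical layout described above:

```asm
CODE    SEGMENT
        ASSUME  CS:CODE                 ; note: no ASSUME DS:CODE

        MOV     BYTE PTR VAR, 1         ; forward ref: pass 1 leaves no
                                        ; room for the CS: prefix that
                                        ; pass 2 decides is needed -> phase error
        MOV     BYTE PTR CS:VAR, 1      ; explicit prefix: size known in pass 1

VAR     DB      0
CODE    ENDS
        END
```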
Perhaps the most questionable MASM feature is guessing that when possible, a label refers to the value at the label’s address. Thus ‘MOV AX,WORD PTR [VAR]’ can be shortened to ‘MOV AX,[VAR]’, because MASM reasonably assumes that moving to AX means a word-sized operation, but the same result can also be achieved with just ‘MOV AX,VAR’. This leads to a confusing syntax where brackets sometimes must be used as a dereferencing operator and sometimes they’re optional.
I’m not sure what problem Microsoft was trying to solve by making the syntax so vague. It is clearly inconsistent because ‘MOV AX,BX’ and ‘MOV AX,[BX]’ are two different things, yet ‘MOV AX,VAR’ and ‘MOV AX,[VAR]’ is (often) the same. It’s the kind of syntactic sugar that’s bad for you.
It’s even worse because there are differences between MASM versions in this area. For example, MASM 1.10 will assemble ‘MOV AX,VAR’ the same way regardless of how VAR is defined. But IBM MASM 2.0 accepts it only if ‘VAR DW 0’ is defined, and reports an error (“Operand types must match”) if ‘VAR DB 0’ is seen instead. MASM 5.10A flags the situation as a warning (again “Operand type must match”) and produces the same code as old MASM 1.10. Microsoft appears to have gone back and forth on this, probably because the original MASM behavior was unhelpfully vague but too much existing code relied on it.
Some other assemblers (e.g. SCP’s ASM) have unambiguous syntax and ‘MOV AX,VAR’ will correspond to MASM’s ‘MOV AX,OFFSET VAR’; if dereferencing is desired, it must be made explicit with brackets.
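To make the ambiguity concrete, here is a hypothetical side-by-side (encodings are the standard 8086 forms; the variable is illustrative):

```asm
VAR     DW      1234h

        MOV     AX, VAR             ; A1 xxxx - loads the WORD stored at VAR
        MOV     AX, [VAR]           ; A1 xxxx - the exact same instruction
        MOV     AX, OFFSET VAR      ; B8 xxxx - loads VAR's address instead
        MOV     AX, [BX]            ; 8B 07   - here the brackets are required
                                    ;           and really do dereference
```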
Much of this used to be documented in old MASM manuals, like the one here. For whatever reason, newer MASM documentation (e.g. MASM 5.0 User’s Guide) does not bother explaining these seemingly small but very important details which are tied to MASM’s two-pass processing. The behavior is not difficult to grasp once the basics of MASM operation are understood, but without that, MASM may appear to behave in a very arbitrary and capricious manner.
Did Borland document any of that behaviour better than Microsoft due to Turbo Assembler optionally emulating it?
I don’t recall seeing this clearly explained in Borland’s documentation, but I could have missed it.
And it’s not like Microsoft never documented it, more like it became some kind of a lost art. You’d think a nearly 500-page MASM 5.0 Programmer’s Guide could spare a few paragraphs explaining the MASM passes, but no. Phase errors are mentioned, but not explained in depth. On the other hand, IBM’s MASM 1.0 manual from 1981 actually explains the two passes reasonably well.
Happy to see that I’m not the only person bothered by the fact that ‘MOV AX,LABEL’ and ‘MOV AX,[LABEL]’ mean the same thing (except that in some contexts they don’t). Very questionable choice: complicating the assembler while at the same time also making it more difficult to understand for the programmer.
As someone used to other assemblers, phase errors and two-pass-ness don’t surprise me. What does surprise me is a couple of other things.
– the use of the ‘H’ suffix on hexadecimal numbers, instead of a ‘$’ (or more modern, ‘0x’) prefix. By the time you get to the H, you have to re-interpret what you just read. And often it doesn’t even stand out very much, if the value is for example AAH.
– MOV going the wrong way (destination operand on the left)
– and indeed the “does it dereference or not?” question, but slightly differently: mov label,r0 would *always* move the memory contents at address “label” to r0; if you want the value of the label, indicate an immediate operand with mov #label,r0.
Correction: AAH is not a hexadecimal constant, 0AAH is. I’m not sure where that syntax came from but it was clearly used in Intel’s earlier 8080/8085 assemblers. The syntax may be odd, but it’s less confusing than, say, octal constants in C.
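For illustration, the Intel-style hex constant rule looks like this (values arbitrary):

```asm
; 'H' suffix marks a hex constant; a leading digit is required
; so the assembler can tell a number from a symbol.
        MOV     AL, 0AAH        ; hex constant, value AAh
        MOV     AL, 0FFH        ; the leading 0 is mandatory: plain FFH
                                ; would be parsed as a symbol name
```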
All of these features of MASM can be blamed squarely on Intel. Microsoft just tried to keep MASM compatible with Intel’s earlier ASM86, which was not exactly a crazy objective as such but it meant adopting ASM86’s quirks.
The dst, src syntax was hardly new and again, Intel’s earlier CPUs used it too. I don’t think it is any more objectionable than big endian vs. little endian. What I do find objectionable is assemblers that use a syntax wildly different from the CPU vendor’s documentation.
It is interesting that Intel’s (and Microsoft’s) assemblers were obviously inspired by DEC’s macro assemblers, but DEC pretty consistently used src, dst syntax. MIPS is an example of an architecture roughly contemporary with the 8086 that used the dst, src syntax. IBM’s System/360 is the obvious major architecture that also used dst, src syntax. The concept was clearly not foreign to programmers, as evidenced by the dst, src syntax in C library routines.
The dereferencing weirdness likewise came from ASM86. Conceptually [FOO] is equivalent to [FOO+0] or FOO, i.e. the first element of an array. I do not know why Intel chose to do this. I will note that in 8080/8085 assembly, MOV (move register/memory) and MVI (move immediate) were separate instructions. For the 8086, Intel decided to make MOV the all-singing, all-dancing instruction which created ambiguity.
Intel clearly thought that making ‘MOV AX,FOO’ semantically different from ‘MOV AX,OFFSET FOO’ was a good idea. All those wanting to be compatible with existing source code (Microsoft, DRI, and many more) didn’t have much choice. At the same time there was clearly a dislike of the ambiguous syntax, as evidenced e.g. by the early SCP assembler or later TASM ideal mode.
1. As far as I can recall, Borland never completely explained all of MASM’s quirks in the TASM documentation. Keep in mind that the TASM documentation shrank from version to version. Hard to blame Borland, since TASM was more of a backend product called by their compiler suites. Secondly, it was hard to keep up with all the DOS, OS/2 and finally Windows changes in assembler documentation. There were books on TASM that explained most of those topics a lot more extensively and kept more up-to-date.
2. The $ and src, dst notations come from AT&T and non-Intel assemblers, including MOS, Motorola etc. I guess the AT&T notation came from their Unix adventure, but I can be very wrong here. In fact there were (and still are) x86/x64 compilers supporting AT&T syntax (GNU, for example). ARM in all its incarnations and modes uses dst, src (sometimes adding a 3rd operand) syntax, on the other hand.
3. There could be another assembler that has roots in Intel’s assembler to some extent, and that is the brilliant (for its time) A86 assembler by Mr. Eric Isaacson. Back in XT and AT times this was the fastest assembler available, AFAIK. Mr. Isaacson had worked for Intel.
4. TASM Ideal mode was unfortunately never widely used and accepted, but it did indeed do its best to fix all sorts of MASM idiosyncrasies. I think the main objective, though, was to bring asm programming more up to date with higher-level languages, hence the support for some OOP ideas.
5. Despite “compatibility” with different MASM versions, it went only as far as the syntax was concerned. I remember a friend of mine losing a few evenings trying to link and later debug an EXE consisting of obj files from both MASM and TASM. The final fix was converting the TASM source code to MASM and using a single linker.
Eric Isaacson was one of the ASM86 authors so any similarity was not coincidental.
The “GNU” x86 assembler syntax definitely came from AT&T, but I’m actually not sure if they were the first to use it. In the DOS world it was unheard of, and the most widespread UNIX (i.e. XENIX) used MASM, but then again there’s this on Intel 286 XENIX. I’m having trouble digging up information about the assembler in the earlier Intel XENIX 86.
The woe with mixing assemblers does not surprise me… MASM especially uses many obscure features of OMF, and the linker has to do everything just right. I also know that even Microsoft’s MASM 6.x is only so-so compatible with MASM 5.1, for example. IBM/MS had a lot of code in OS/2 that had to be built with MASM 5.1, there was no chance 6.x would produce the expected result.
I should add that the Intel 286 XENIX assembler looks very AT&T-like but does not use src, dst but rather standard Intel dst, src syntax. The syntax looks fairly unusual to me.
Re src, dst v.s dst, src:
Both are fine as long as you use an appropriate mnemonic.
MOV is IMHO an inappropriate mnemonic for dst, src.
Like who uses the word in real-life language with the destination first (at least without explicitly saying both the word “from” and “to”, which makes the syntax cumbersome)? You can’t put something down before you have picked it up, thus it seems silly to have dst, src.
Zilog changed the syntax from MOV to LD in order to make the assembler code look more sane.
The 6800, 6502 and 6809 all used variants of LD dst, src (although the destination was part of the mnemonic, thus for example LDA src). The 68000 and VAX used MOVE src,dst.
After thinking about it, move isn’t really a good mnemonic at all since it really does a copy and not a move.
PC history related anecdote: If you use automatic sorting rules in Microsoft Outlook 97 and an incoming mail matches more than one rule with the operation “move” to different destinations, it “moves” the mail to all destinations. Move thus does not move a mail; it actually copies it to each destination and deletes the mail from the inbox folder if it exists there. There is an updated version where the “solution” is a tick box to tell the auto processing to skip any further matching rules.
I thought Zilog had to change the mnemonics because they were not allowed to use Intel’s…
Personally I found ‘load/store’ to be no better than ‘move’ when I first encountered it. Especially ‘store’ could be a store to a register as much as a store to memory. Sure, sure, “everyone knows” what’s a load and what’s a store, but that isn’t because the words are somehow more meaningful.
Interestingly the people who designed the memmove() C function also thought that “move dst, src” made sense. It’s probably one of those things that are about as “obvious” as little endian vs. big endian.
I have not had the time to compare the a.out of the different versions of XENIX, nor will I ever have the ability to check an a.out from the VAX compiler(s), but this is what was said:
“Internally XENIX Compilers for the VAX supported 7 or 8 architectures, 8086, 88000, 68000, 80286 ISA, 80286 MC, 80386 ISA, you had the Bus Wars, the UNIX wars and the compilers all going on. ”
( I doubt this as bragging, since the 68K version was shipping, along with the 8086, long before MicroChannel was announced, and It would also indicate that the 80286 ISA programmers would know about the 80386, which I think they did, but ignored it. )
Which I guess is because of this:
“XENIX was originally developed on a DEC Virtual Address Extension (VAX) running the Virtual Memory System (VMS) and a PDP-11 running UNIX V7, albeit now using Microsoft’s own in-house minicomputers, and then converted into assembly language specific to the new 16-bit Motorola 68000 and Intel 8086 microprocessors.”
Somewhere I read about how the first VAX version was developed, by a Canadian company:
“Microsoft purchased a license for Version 7 Unix from AT&T in 1979, and announced on August 25, 1980 that it would make it available for the 16-bit microcomputer market.
The initial development of Xenix was done by Human Computing Resources Corporation of Toronto, Canada. The initial port of Xenix to the Intel 8086/8088 architecture was performed by The Santa Cruz Operation. ” ( HCR was acquired by SCO in 1990 )
Water under the bridge: my interest is which assembler was used to produce the files that did the serialization and that were disassembled.