Suppose you want to run the original 1981 vintage IBM Pascal 1.0 (supplied by Microsoft) on a PC that is less than 30 years old. Upon execution, PAS1.EXE may well fail with the following error:
Error: Compiler Out Of Memory
Given that the compiler was designed to run on the original IBM PC and only required 128K memory, why is it failing on a system with a lot more? The real reason is of course not that there isn’t enough memory; the problem is that there’s too much. Let’s see how that works (or rather doesn’t work) exactly.
IBM Pascal 1.0 suffers from a problem that is common to a number of products built with the Pascal compiler, specifically programs using the Pascal run-time startup code. That notably includes early versions of MASM as well as the Pascal compiler itself.
There are several variants of the run-time startup with slightly varying behavior, but the core problem is the same. The startup code calculates memory sizes and addresses as 16-bit quantities in units of paragraphs (16 bytes). That way, a 16-bit value covers the entire 1MB address space of an 8086, which in fact mirrors how 8086 segment registers work.
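The paragraph arithmetic is easy to check with a few lines of Python. This is only an illustrative sketch; the helper name `to_paragraphs` is made up for this example and is not taken from the run-time.

```python
PARAGRAPH_SIZE = 16  # bytes per 8086 paragraph


def to_paragraphs(byte_count):
    """Round a byte count up to whole 16-byte paragraphs (hypothetical helper)."""
    return (byte_count + PARAGRAPH_SIZE - 1) // PARAGRAPH_SIZE


# 64K is 4096 (1000h) paragraphs, 512K is 8000h, 640K is A000h.
size_64k = to_paragraphs(64 * 1024)
size_512k = to_paragraphs(512 * 1024)
size_640k = to_paragraphs(640 * 1024)

# The last paragraph of the 1MB address space still fits in 16 bits.
last_paragraph = to_paragraphs(1024 * 1024) - 1
```

Note that 512K is exactly 8000h paragraphs, the point at which a 16-bit value has its top bit set.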
The Pascal run-time relocates part of the loaded executable image towards higher addresses to make room for the heap and stack. It attempts to make the data segment (heap, stack, static data, constants) up to 64K in size, but will settle for however much is actually available.
The problem with the Pascal run-time is that it uses a signed comparison. That is logically wrong because a PC can’t have a negative amount of memory. The signed comparison produces incorrect results as soon as one of the 16-bit values involved reaches 8000h and is therefore treated as negative. Let’s consider some of the variants.
Broken One Way
Variant A, found in programs produced with IBM Pascal 1.0, first calculates the paragraph address of the bottom of the data segment in the executable (load address plus size of memory preceding the data segment). 64K (or 4096 paragraphs) is added to that value, and it is then compared against the highest available paragraph address (read from offset 2 in the PSP). If there is more than 512K conventional memory, the highest available paragraph will be 8000h or higher, and will be interpreted as a negative signed value. That will be less than the desired top of the data segment.
If the signed comparison produces the wrong result, the run-time thinks there is less memory available than the desired maximum, and tries to use as much as it believes there is. That may not be enough for the application, which may then fail with an out of memory error.
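The effect can be modeled in a few lines of Python. This is a sketch of the logic as described above, not a disassembly: the function names, the example data segment base of 2000h, and the exact direction of the comparison are assumptions made for illustration.

```python
DATA_SEG_MAX = 0x1000  # 64K expressed in paragraphs


def as_signed16(value):
    """Interpret a 16-bit value the way a signed comparison does."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def variant_a_sees_full_64k(data_seg_base, top_of_memory, signed=True):
    """Model of the Variant A check (hypothetical reconstruction).

    data_seg_base: paragraph address of the bottom of the data segment
    top_of_memory: highest available paragraph, as read from PSP offset 2
    """
    desired_top = (data_seg_base + DATA_SEG_MAX) & 0xFFFF
    if signed:
        return as_signed16(desired_top) <= as_signed16(top_of_memory)
    return desired_top <= top_of_memory


# With just under 512K, the signed check still gives the right answer.
ok_small = variant_a_sees_full_64k(0x2000, 0x7FFF)
# With 640K (top of memory A000h), the signed check wrongly reports a shortage.
broken_large = variant_a_sees_full_64k(0x2000, 0xA000)
# An unsigned comparison gets it right.
fixed_large = variant_a_sees_full_64k(0x2000, 0xA000, signed=False)
```

In this model the result flips purely because the top-of-memory value crosses 8000h, regardless of how much memory the program actually needs.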
Broken Another Way
Variant B is slightly different. It affects, for example, IBM MASM 1.0 and has been analyzed here. In this case, the startup code takes the highest available paragraph address from offset 2 in the PSP, subtracts the base address of the data segment from it, and then does a signed comparison against 64K (4096 paragraphs) to see if the maximum is available for the data segment.
The failure mode of this variant does not depend on the highest available paragraph address but rather on the amount of free memory above the data segment.
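A similar Python model shows why free memory is what matters here. Again, this is a hedged sketch: the function names and the example load addresses are invented, and the 16-bit wraparound is assumed to behave as it would in 8086 registers.

```python
DATA_SEG_MAX = 0x1000  # 64K expressed in paragraphs


def as_signed16(value):
    """Interpret a 16-bit value the way a signed comparison does."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def variant_b_sees_full_64k(data_seg_base, top_of_memory, signed=True):
    """Model of the Variant B check (hypothetical reconstruction)."""
    free = (top_of_memory - data_seg_base) & 0xFFFF  # 16-bit subtraction
    if signed:
        return as_signed16(free) >= DATA_SEG_MAX
    return free >= DATA_SEG_MAX


# Same 640K machine (top of memory A000h), two different load addresses.
loaded_low = variant_b_sees_full_64k(0x1000, 0xA000)   # 9000h paragraphs free
loaded_high = variant_b_sees_full_64k(0x4000, 0xA000)  # 6000h paragraphs free
loaded_low_unsigned = variant_b_sees_full_64k(0x1000, 0xA000, signed=False)
```

Under these assumptions the check fails once 8000h paragraphs (512K) or more are free, which would also explain why loading the program higher in memory can make the problem disappear.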
Symptoms and Workarounds
If the user is lucky (sort of), the affected program will report an out of memory error and not cause any harm beyond that. Because the run-time is not careful enough, it is possible to end up with a dynamically allocated stack smaller than the one built into the executable, and then the executable will hang (loop endlessly) while trying to print the “out of memory” error message.
Further analysis of problems with the Pascal run-time may be found here.
The most complete fix for the problem is to patch the executable and replace the problematic signed comparison with an unsigned one (for example, replace a JLE instruction with JBE). That is also the most difficult fix because it requires analysis of the start-up code.
A less intrusive, less complete, but much simpler and usually sufficient fix is to change the EXE header to reduce the maximum allocation. That way, instead of trying to grab all available memory, the executable will only get (for example) 64K, which will almost certainly prevent the overflow.
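A header patch of this kind might look like the following Python sketch. It relies on the standard DOS EXE header layout, where the minimum and maximum extra allocation are 16-bit little-endian paragraph counts at offsets 0Ah and 0Ch. The function name and the default of 1000h paragraphs (64K) are illustrative choices, not a recommendation for any particular program.

```python
import struct

E_MINALLOC = 0x0A  # minimum extra paragraphs needed
E_MAXALLOC = 0x0C  # maximum extra paragraphs requested


def limit_max_alloc(header, max_paragraphs=0x1000):
    """Lower the maximum allocation in a DOS EXE header, in place.

    header must be a bytearray holding at least the first 14 bytes of
    the file. Returns the value actually written.
    """
    if bytes(header[:2]) not in (b"MZ", b"ZM"):
        raise ValueError("not a DOS EXE header")
    (min_alloc,) = struct.unpack_from("<H", header, E_MINALLOC)
    # Never go below the minimum, or DOS would refuse to load the program.
    new_max = max(min_alloc, max_paragraphs)
    struct.pack_into("<H", header, E_MAXALLOC, new_max)
    return new_max


# Demonstration on a synthetic 28-byte header that asks for all memory (FFFFh).
demo = bytearray(28)
demo[0:2] = b"ZM"
struct.pack_into("<H", demo, E_MINALLOC, 0x0100)
struct.pack_into("<H", demo, E_MAXALLOC, 0xFFFF)
written = limit_max_alloc(demo)
```

To apply this to a real file, one would read it into a `bytearray`, call `limit_max_alloc`, and write the result to a copy, keeping the original executable untouched.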
In some cases, the LOADFIX utility may also change the behavior by loading the executable higher in memory. This does not require modification of the executable but also may not help at all.
The previously referenced article claims the following: “The IBM Personal Computer MACRO Assembler”, also known as MASM, published by Microsoft and IBM since 1981, was one the firsts[sic] Assembler programs to run under MS-DOS / PC DOS on IBM PC or compatible computers. Lot’s[sic] of code was written for MASM, notably the MS-DOS kernel itself and ROM-BIOS code. MASM is therefore of historical importance in the field of personal computing.
While the historical importance of MASM is indisputable, the quoted text is slightly misleading. The BIOS of the early IBM PCs was written on Intel development systems (not on PCs for some funny reason) and built using Intel’s development tools, notably the Intel ASM86 V1.0 assembler. DOS 1.x was built using Tim Paterson’s SCP assembler, named simply ASM. Microsoft’s MASM was clearly an important product, but it played no role in the initial development of the PC’s ROM BIOS and DOS. It was used for DOS 2.0 and later, as well as the IBM PC ROM BIOS since circa 1983.
It is almost certainly true that (IBM) MASM was the first assembler commercially available specifically for the IBM PC. Intel’s ASM86 was only ported to DOS in the mid-1980s. SCP’s ASM was not sold by Microsoft or IBM, although it was almost certainly the first assembler which ran on the IBM PC by virtue of the extremely close relationship between SCP’s 86-DOS and PC DOS.
Trivia: The IBM Personal Computer Pascal Compiler V1.00 executables (PAS1.EXE and PAS2.EXE) do not have the typical ‘MZ’ signature in the EXE header at the very beginning of the file. Instead, they have a ‘ZM’ signature. That is considered equivalent to ‘MZ’ by all “true” DOS implementations (not counting DOS 1.x, where COMMAND.COM loads EXE files and does not check the signature). The second word in the PAS1 and PAS2 EXE header also does not indicate the number of bytes in the last page of the executable, but rather the version of the linker used to create it.
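For anyone poking at such files, a small Python sketch can report which signature an executable carries and what its second header word contains. The function name is hypothetical, and how to interpret the second word (last-page byte count or linker version) is left to the reader, as the text above explains.

```python
import struct


def describe_exe_header(header):
    """Return (signature, second_word) for a DOS EXE header, or None.

    header: at least the first 4 bytes of the file.
    Both 'MZ' and 'ZM' are accepted, mirroring what DOS itself does.
    """
    signature = bytes(header[:2])
    if signature not in (b"MZ", b"ZM"):
        return None
    (second_word,) = struct.unpack_from("<H", header, 2)
    return signature.decode("ascii"), second_word


# Synthetic examples; the second word here is an arbitrary made-up value.
zm_example = describe_exe_header(b"ZM" + struct.pack("<H", 0x0123))
not_exe = describe_exe_header(b"\x00\x00\x00\x00")
```

On a real file, the header bytes would come from something like `open(path, "rb").read(4)`.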