The A20-Gate Fallout

A recent post explored the motivation (i.e. backwards compatibility) to implement the A20 gate in the IBM PC/AT. To recap, the problem IBM solved was the fact that 1MB address wrap-around was an inherent feature of the Intel 8086/8088 CPUs but not of the 80286 and later models, yet a number of commercial software packages intentionally or unintentionally relied on the wrap-around.

Interestingly, it is obvious that the address wrap-around was much better known and understood in 1981 than it was in the 1990s. For example, in 1994, the usually very well informed Frank van Gilluwe wrote in Undocumented PC (page 269): “A quirk with the 8088 addressing scheme allowed a program to access the lowest 64KB area using any segment:offset pair that exceeded the 1MB limit. […] Although there is no reason for software to ever use this quirk, bugs in a few very old programs used segment:offset pairs that wrap the 1MB boundary. Since these programs seemed to work correctly, no actions were taken to correct the defects.”
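
To make the quirk concrete, here is a minimal illustrative sketch (plain C, not anything IBM or Microsoft ever shipped) of the address arithmetic involved. The 8086/8088 has only 20 address lines, so the carry out of the top bit is simply lost, while the 286 keeps it:

    #include <stdio.h>

    /* Physical address generated by an 8086/8088: 20 address lines,
       so any carry into bit 20 is silently dropped. */
    static unsigned long phys_8086(unsigned seg, unsigned off)
    {
        return ((unsigned long)seg * 16 + off) & 0xFFFFFUL;
    }

    /* Physical address generated by an 80286 in real mode: 24 address
       lines, so the carry into A20 is preserved. */
    static unsigned long phys_80286(unsigned seg, unsigned off)
    {
        return ((unsigned long)seg * 16 + off) & 0xFFFFFFUL;
    }

    int main(void)
    {
        /* FFFF:0010 is the classic case: 00000h on an 8086, 100000h on a 286. */
        printf("8086:  %05lXh\n", phys_8086(0xFFFF, 0x0010));
        printf("80286: %06lXh\n", phys_80286(0xFFFF, 0x0010));
        return 0;
    }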

Yet it is known that Tim Paterson quite intentionally used the wrap-around to implement CALL 5 CP/M compatibility in QDOS around 1980, and Microsoft Pascal intentionally used it in 1981. In both cases there were arguably very good reasons for using the wrap-around.

Intentional or not, software relying on 8086 address wrap-around was out there and important enough that by the end of 1983, IBM had implemented the A20 gate in the upcoming PC/AT. But did they have to do that?

Possible Alternatives

Let’s explore several what-if scenarios which would have preserved a high degree of compatibility with existing software without requiring the A20 gate.

The CALL 5 interface was a bit of a tough nut to crack because it was a clever (too clever?) hack to begin with. One option would have been to place an INT 3 instruction at offset 5 (the one documented single-byte interrupt instruction), or, with slight restrictions, use a two-byte interrupt instruction. That would have avoided the need for wrap-around. It might not have been nice but it would have been manageable.

The Pascal run-time was a much nastier problem. Existing variants of the start-up code might have been detected and patched, but it was reasonable to assume that modified variants were out there. It was also reasonable to assume that Microsoft Pascal was not the only piece of code guilty of such shenanigans. The bulletproof solution would have been simple but probably unpalatable—force all or some applications to load above 64K. Any remaining free memory below 64K might still have been available for dynamic allocation. If this option was considered at all, it was likely thought too steep a price to pay for the sins of the past (that is, the 1981-1983 era).

A serious and quite possibly fatal shortcoming of software workarounds was that they required modified software. In an era where bootable floppies were the norm, and even third-party software was sometimes delivered on bootable floppies, a solution which did nothing for users’ existing bootable diskettes was probably considered a non-solution.

A Solution and a Problem

The A20 gate was easy to implement in the PC/AT because there were no caches to contend with. Simply forcing the output of the CPU’s address pin A20 to logical zero was all it took to regain the address wrap-around behavior of the 8086. The switch was hooked up to the keyboard controller, which already had a few conveniently available output pins.

The implementation was clearly an afterthought; in the PC/AT days, DOS didn’t know or care about the A20 line, and neither did DOS applications. DOS extenders weren’t on the horizon yet. There was no BIOS interface to control the A20 gate, but IBM probably didn’t think there needed to be one—the INT 15h/87h interface to copy to/from extended memory took care of the A20 gate, and the INT 15h/89h interface to switch to protected mode also made sure the A20 gate was enabled. Everyone else was expected to run with the A20 gate disabled.
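
For illustration only, a rough sketch of what using the INT 15h/87h block-move service looked like from real-mode C (int86x, segread and the FP_SEG/FP_OFF macros are assumed to be the usual Turbo C/MS C library helpers; error handling, word-count limits and interrupt considerations are omitted):

    #include <dos.h>
    #include <string.h>

    /* 286-style descriptor as expected by INT 15h, AH=87h (8 bytes each). */
    struct desc {
        unsigned short limit;      /* segment limit in bytes, minus one   */
        unsigned short base_lo;    /* physical base, bits 0-15            */
        unsigned char  base_hi;    /* physical base, bits 16-23           */
        unsigned char  access;     /* access rights byte                  */
        unsigned short reserved;   /* zero on a 286                       */
    };

    /* Copy 'words' 16-bit words between physical addresses via the BIOS.
       The BIOS takes care of the A20 gate, does the copy in protected
       mode, and returns to real mode. Returns AH (0 = success). */
    int bios_block_move(unsigned long src, unsigned long dst, unsigned words)
    {
        struct desc gdt[6];        /* dummy, GDT, source, target, BIOS CS, SS */
        union REGS r;
        struct SREGS s;

        memset(gdt, 0, sizeof gdt);

        gdt[2].limit   = words * 2 - 1;               /* source descriptor */
        gdt[2].base_lo = (unsigned short)src;
        gdt[2].base_hi = (unsigned char)(src >> 16);
        gdt[2].access  = 0x93;                        /* present, writable data */

        gdt[3].limit   = words * 2 - 1;               /* target descriptor */
        gdt[3].base_lo = (unsigned short)dst;
        gdt[3].base_hi = (unsigned char)(dst >> 16);
        gdt[3].access  = 0x93;

        segread(&s);
        s.es   = FP_SEG((void far *)gdt);             /* ES:SI -> descriptor table */
        r.x.si = FP_OFF((void far *)gdt);
        r.h.ah = 0x87;
        r.x.cx = words;
        int86x(0x15, &r, &r, &s);
        return r.h.ah;
    }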

OEMs like HP or AT&T similarly didn’t think that A20 gate control was something important and devised their own schemes of controlling it, incompatible with IBM’s. There was no agreement across implementations on what the A20 gate even did—some really masked the A20 address line (PC/AT), some only mapped the first megabyte to the second and left the rest of the address space alone (Compaq 386 systems), and others implemented yet other variations on the theme. The effects of the A20 gate were likewise inconsistent across implementations when paging and/or protected mode was enabled.

The real trouble started around 1987-1988, with two completely unrelated developments. One was the advent of DOS extenders (from Phar Lap, Rational, and others), and the other was Microsoft’s XMS/HIMEM.SYS and its use of the HMA (High Memory Area, the first 64K above the 1MB line). In both cases, there was a requirement to turn the A20 gate on in order to access memory beyond 1MB, and to turn it off again to preserve compatibility with existing software relying on wrap-around (which, thanks to EXEPACK, had only proliferated since the PC/AT was introduced).
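
As a reminder of what the HMA actually is in terms of addresses: with the A20 gate enabled, the highest real-mode segment reaches just under 64K beyond 1MB. A tiny illustrative calculation:

    #include <stdio.h>

    int main(void)
    {
        /* With A20 enabled, segment FFFFh addresses memory above 1MB. */
        unsigned long first = 0xFFFFUL * 16 + 0x0010;   /* FFFF:0010 = 100000h */
        unsigned long last  = 0xFFFFUL * 16 + 0xFFFF;   /* FFFF:FFFF = 10FFEFh */

        printf("HMA: %06lXh-%06lXh, %lu bytes\n",
               first, last, last - first + 1);          /* 65520 bytes */
        return 0;
    }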

The big problem which implementors of DOS extenders and HIMEM.SYS faced was that there was no BIOS interface to control the A20 gate alone, and no uniform hardware interface either. A lesser problem (but still a problem) was that switching the A20 gate on or off through the keyboard controller (KBC) wasn’t very fast, so even on systems which did provide a compatible KBC interface, any faster alternative was well worth using.

The solution was to do it the hard way. Some versions of HIMEM.SYS, for instance, provided no fewer than a dozen A20 handlers for various systems, plus an interface to install a custom OEM handler. Around 1990, OEMs realized that this wasn’t workable and only hurt them, and stopped inventing new A20 control schemes. Effectively all systems provided PC/AT-style A20 gate control through the KBC, and typically also PS/2-style control through I/O port 92h (much faster than the KBC).
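
For illustration, here is roughly what the two surviving control methods look like from real-mode C (inportb/outportb are assumed Turbo C-style port I/O helpers; real implementations such as HIMEM.SYS add timeouts, verify the result by testing for wrap-around, and carry many machine-specific variants besides these two):

    #include <dos.h>

    /* Wait until the keyboard controller's input buffer is empty
       (bit 1 of the status port clear). Real code adds a timeout. */
    static void kbc_wait(void)
    {
        while (inportb(0x64) & 0x02)
            ;
    }

    /* PC/AT method: rewrite the KBC output port; bit 1 controls the A20 gate. */
    static void a20_set_kbc(int enable)
    {
        kbc_wait();
        outportb(0x64, 0xD1);                   /* "write output port" command */
        kbc_wait();
        outportb(0x60, enable ? 0xDF : 0xDD);   /* output port value, A20 bit on/off */
        kbc_wait();
    }

    /* PS/2 method: bit 1 of I/O port 92h, much faster than the KBC round trip.
       Bit 0 is the fast-reset bit and must be left untouched. */
    static void a20_set_port92(int enable)
    {
        unsigned char v = inportb(0x92);
        v = enable ? (v | 0x02) : (v & ~0x02);
        outportb(0x92, v);
    }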

There were complications for CPU vendors, too. Intel was forced to add the A20M# input pin to the i486—the CPU needed to know what the current state of the A20 gate was, so that it could properly handle look-ups in the internal L1 cache. This mechanism had to be implemented in newer CPUs as well.

Cyrix faced even greater difficulties with the 486DLC/SLC processors designed as 386 upgrades. These processors had a 1K internal cache (8K in the case of the Texas Instruments-made chips) and did implement the A20M# pin, but also needed to work in old 386 boards which provided no such signal. That left only unpleasant options, such as not caching the first 64K of the first and second megabytes.

The A20 gate also considerably complicated the life of system software. For example, memory managers like EMM386 needed to run the system with the A20 gate enabled (otherwise, on a 2MB system, no extended memory might be available!) but emulate the behavior software expected. EMM386 needed to trap and carefully track A20 gate manipulation through the KBC, as well as through port 92h. When the state of the A20 gate changed, EMM386 had to update the page tables—either to create a mapping for the first 64K also at linear address 100000h (A20 gate disabled), or to remove it again (A20 gate enabled).
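
A very rough sketch of that paging trick (the page-table array here is hypothetical, not EMM386’s actual data structure; 4K pages on a 386, and the TLB must be flushed after the change):

    #include <stdint.h>

    #define PAGE_SIZE    0x1000u
    #define PTE_PRESENT  0x001u
    #define PTE_WRITABLE 0x002u

    /* Hypothetical page table covering the linear range 100000h-1FFFFFh
       (the second megabyte), one 32-bit entry per 4K page. */
    extern uint32_t second_mb_pt[1024];

    /* Emulate the A20 gate state under paging: with A20 "disabled",
       linear 100000h-10FFFFh must alias physical 000000h-00FFFFh;
       with A20 "enabled", it maps 1:1 to real extended memory. */
    void a20_emulate(int a20_enabled)
    {
        unsigned i;

        for (i = 0; i < 16; i++) {               /* 16 pages x 4K = 64K window */
            uint32_t phys = a20_enabled
                ? 0x100000u + i * PAGE_SIZE      /* identity mapping       */
                : 0x000000u + i * PAGE_SIZE;     /* alias of the low 64K   */
            second_mb_pt[i] = phys | PTE_PRESENT | PTE_WRITABLE;
        }

        /* Reload CR3 (or use INVLPG on each page) to flush stale TLB entries. */
    }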

Hindsight Is 20/20

From today’s vantage point, it is obvious that IBM should have just sucked it up back in ’84: leave the A20 address line in the PC/AT alone, and force software to adapt. That would have saved software developers and users so much time, effort, and money over the subsequent decades. Complexity is expensive; that is an unavoidable fact of life.

It’s just as obvious that in 1984, the equation was different, and adding the A20 gate to the PC/AT was considered the lesser evil (relative to breaking existing software). Predicting the future is a tricky business, and back then, DOS extenders or the HMA probably weren’t even a gleam in someone’s eye. IBM likely assumed that DOS would be gone in a few years, replaced by protected-mode software which has no need to mess with the A20 gate (such as IBM’s own XENIX, released in late 1984).

By the time the A20 gate started causing trouble around 1987, reliance on the address wrap-around was much more entrenched than it had been in 1984, not least thanks to EXEPACK. At that point, the only practical option was to press on. There was no longer a company which could have said “enough with this nonsense”; hardware had to support existing software, and software had to support existing hardware.

Over time, the A20 gate turned into a security liability and modern CPUs deliberately ignore the A20M# signal in various contexts (SMM, hardware virtualization).

Many years and many millions of dollars later, here we are. DOS has been sufficiently eradicated that some recent Intel processors no longer provide the A20M# pin functionality. Some.

Even the folks at Intel clearly don’t understand what it’s for, and thus a recent Intel SDM contains amusing howlers like a claim that “the A20M# pin is typically provided for compatibility with the Intel 286 processor” (page 8-32, vol. 3A, edition 065). To be fair, other sections in the same SDM correctly state that the A20M# pin provides 8086 compatibility.

It is likely that over the next few years, the A20M# functionality will be removed from CPUs. Since legacy operating systems can no longer be booted on modern machines, it is no longer necessary. In an emulator/hypervisor, the A20 gate functionality can be reasonably easily implemented without hardware support (manipulating page tables, just like EMM386 did decades ago). Goodbye, horrors of the past.


93 Responses to The A20-Gate Fallout

  1. dosfan says:

    IBM did define an A20 BIOS interface in their later PS/2 systems (INT 15h function 24h, with subfunctions 0-3) but to my knowledge no one else copied this as IBM wasn’t relevant by that time.

  2. Michal Necasek says:

    Right, they did. The problem was that it came too late to be useful. If they had introduced such an interface in the PC/AT, everyone would have used it. But I really don’t think the PC/AT designers anticipated that A20 gate control would be in any way important.

  3. Yuhong Bao says:

    @dosfan: I think some BIOS vendors at least did copy this eventually.

  4. Cutter says:

    Seems to me that, in hindsight, the easiest way to fix the problem would have been in the 286 itself. When in real mode it wraps like the 8086 did, when in protected mode it doesn’t. There wouldn’t have been any ‘high memory’ anymore, and unreal mode might have been less useful, but there wouldn’t be a need for a gate.

    One could argue that it’s actually a flaw in the 286 as it’s breaking compatibility with the 8086.

  5. Yuhong Bao says:

    Do note that the main difference between real and protected mode from the perspective of the 286 and later is how segment registers are loaded.

  6. dosfan says:

    Keep in mind that the 80286 was released in early 1982 approximately 2½ years before the IBM PC AT. Intel couldn’t have known that software would rely on quirks. Heck Intel originally defined interrupt vectors 05h-1Fh as reserved for future use but that didn’t stop IBM from using them anyway which of course caused conflicts.

  7. Richard Wells says:

    The easiest way to have fixed the problem would have been for DRI to include a memory size function in CP/M 2. That would have meant that Wordstar would not need to check the address of the CALL 5 jump which would have meant that CALL 5 in DOS would not need to use address wraparound which would mean the DOS 5 plan of only wrapping addresses during application initialization could have succeeded. Certainly would not have had the DOS HMA dance where every call to DOS requires setting and then resetting the A20 line.

    All that work because DRI chose to save a few bytes in 1979 which DRI wound up having to give back in MP/M because of the need to have CALL 5 do two jumps (once to the top of available memory which in turn has a jump to the actual function dispatch).

  8. dosfan says:

    You’re assuming that DOS programs actually used CALL 5 and they didn’t. I used DOS quite extensively including having worked on DOS itself and I have never come across a single program which used CALL 5 (supposedly WordStar 3.02 did but that’s unverified). The DOS 5 A20 toggling was done to support the EXEPACK loader and similar software.

  9. Richard Wells says:

    There are at least two DOS 1.0 programs that use CALL 5, both by Paterson. Slightly obfuscated though:
    TRANS.ASM: The call to DOS is written as CALL SYSTEM with SYSTEM: EQU 5

    ASM.ASM has a SYSTEM: function containing just CALL 5 and RET, which the PRINT routines seem to fall through. Interestingly, ASM.ASM also has INT calls to DOS, though written as INT 33.

    I have a few other suspect programs in which I can’t see a call to INT 21h or INT 33, but that does not mean they use CALL 5. Finding DOS programs written in 1982 or earlier that were adapted from CP/M-80 is a trifle difficult.

    Note that DOS 5 beta did make an effort to replace the wrap around address for CALL 5 with a no wrap equivalent. If no application used CALL 5, no reason to go through that work. Between that and the reset of wrap around upon program exit, it looks like the goal in DOS 5 would have been to have the address wrap working for program initialization to handle Pascal or EXEPACK and then turn off wrap around to make the HMA always available. Didn’t work, alas.

  10. Sean McDonough says:

    > legacy operating systems can no longer be booted on modern machines

    Wait, what?

  11. MiaM says:

    A hypothetical solution they could have added would be a permanent wrap-around for the first 64K above 1MB. It would have been annoying for manufacturers of RAM expansions above 1MB to skip the first 64K, but doable. As HIMEM/HMA hadn’t been “invented” at the time, nothing existing would have been lost.

  12. MiaM says:

    Yuhong Bao and dosfan:

    Intel probably didn’t anticipate the success of the PC and the diminishing sales of non-PC-compatible x86 computers. If they had, they would have made the internal interrupt controller, DMA controller and timers of the 80188/80186 compatible with the PC. It wouldn’t have cost much more (if anything at all) compared to how it’s actually done in the 80188/80186, and it would probably have been a huge success. With the right price, it would probably have outsold the original 8088/8086 processors as soon as existing designs were replaced by updated versions (for example the switch from 16/32/48/64K PC motherboards to the 64/128/192/256K motherboards, if that occurred after the 80188 was released).

    Compare with how fast dedicated chipsets replaced large amounts of discrete logic.

    Yes, I know this is a sidetrack discussion, but it’s IMHO a good indicator of how Intel did view the personal computer market at that time.

  13. dosfan says:

    As far as I know, Paterson’s 86-DOS ASM and TRANS were never actually used in PC DOS/MS-DOS so those wouldn’t have been relevant. Once DOS 2.0 came out in 1983, software quickly ditched the use of FCBs in favor of the newer file handle API which supported subdirectories, so by 1990 when DOS 5 was being developed any DOS 1.x software would have been obsolete. Even DOS 1.x software like VisiCalc and WordStar 3.20 used INT 21h, so Microsoft could have safely gotten rid of CALL 5. Who knows why they bothered to fix CALL 5.

    Sean McDonough: Intel has said that by 2020, UEFI will no longer support the CSM layer which provides legacy BIOS support so those systems will not be able to run legacy operating systems (e.g. DOS, OS/2, 32-bit Windows, 32-bit Linux).

  14. Michal Necasek says:

    Yes — but as others say, that was impossible given the timeline. The 186/286 was undoubtedly mostly designed before the IBM PC even came out, and the 186/286 CPUs were available at most only a few months after the PC became available in quantity. There was no chance to design the 286 with PC and DOS in mind.

    The Intel 386 was the first CPU where DOS compatibility played a role in the design (V86 mode). I think keeping backwards compatibility with existing binary software was a new concept to Intel at that point.

  15. Michal Necasek says:

    DOS 5 does not switch the A20 gate all the time. The basic approach is to switch when a new program starts and then leave A20 enabled. There’s pretty complicated logic in DOS 5/6 dealing with that, detecting EXEPACK and such. And of course if EMM386 is loaded then the actual A20 gate is never touched at all at runtime.

  16. Michal Necasek says:

    There are already systems out there that have only EFI and no CSM. Only an EFI-capable OS can be booted.

  17. Michal Necasek says:

    Again, “at the time” was when the personal computer market did not exist for all intents and purposes. I think you’re absolutely right that an 80188/80186 would have been a hit if it had been PC compatible, but it came out at just the wrong time for that.

    It is also known that until late in the 386 development cycle (1984-1985), Intel didn’t view the x86 line as all that interesting or important.

  18. Michal Necasek says:

    Why Microsoft bothered to touch CALL 5 in the first place is a good question. Maybe someone just thought they’d straighten out some gnarly old code.

    Why they bothered to fix it is clear — because it broke WordStar 3.2x/3.3x, which does not use CALL 5 but does use the word at offset 6.

  19. zeurkous says:

    Yeah, well, IME there’s no good reason to have an IBM pee-cee derivative
    that isn’t at all IBM pee-cee compatible anymore. They’ve always been
    kludgy, but then throwing out the one thing that made us put up with
    it… fail, just fail.

    And then there’s Intel wanking out that ‘efi’ crap when openboot already
    existed and does the job well…

  20. ender says:

    I’ve seen a bunch of tablets that have UEFI without CSM (and more annoyingly, some have 32-bit UEFI despite having x64 CPU, so you can’t install x64 Windows on them, since it only allows booting in the same bitness as the firmware, unlike Linux).

  21. zeurkous says:

    Again, what’s the point of keeping all the cruft around if compat is
    broken anyway?

    Or *is* Intel finally getting rid of i86?

  22. MTA says:

    Thanks for a good post! I am always fascinated by A20; it must be one of the longest-lived hacks.

    Regarding this section I have a question:

    “Many years and many millions of dollars later, here we are. DOS has been sufficiently eradicated that some recent Intel processors no longer provide the A20M# pin functionality. Some.”

    Did Intel really drop A20 compatibility totally from any CPU? I have seen some rumors about this but it appeared it was merely the physical pin that was dropped, with the A20 signal just carried over a different protocol from chipset to CPU. I have also seen mention in Intel chipset documentation suggesting A20 might not be supported, but I haven’t heard about a CPU that verifiably doesn’t implement it, though it might definitely be the case.

    If they do drop A20, I would concur with zeurkous that they should then go much further. If A20 is dropped in response to the fact that old operating systems can’t boot anyway (see footnote *) then why not take it further and completely eliminate real mode and non-long protected mode.

    It is interesting to ponder how this would go in detail but I imagine something like this: The CPU would wake up directly in x86-64 mode except that paging would be disabled, so memory would be “identity-mapped”. Many bits in CR0-CR4 and MSRs would from boot be in their “modern” values, and fixed at that (for instance concerning floating-point support, SSE, enablement of long mode etc.). You could go two ways: either eliminate the GDT entirely (and only keep some support for FS/GS through control registers; all other segment registers would act as if they have base 0 and infinite limit), or you could keep the GDT as today, in which case the BIOS, as one of its first actions, would need to set up a GDT. Also, the BIOS would set up an IDT. The BIOS likely wouldn’t set up paging; that would just be left to the operating system, which would maybe simply set the “PG” flag (as maybe one of the only required CR0 manipulations) after having configured the paging structures.
    So we have already removed a lot…

    However, there are still many details to consider… how about the 32-bit compatibility sub-mode of long mode? Does it need to stay or should it just be emulated? I would argue it has to stay because emulation adds a lot of overhead. And it isn’t too complicated system-architecture-wise. All 16-bit stuff is gone; only 32-bit (at the instruction set level, not system architecture) and pure 64-bit is left.

    But how about virtualization extensions then? Should they support real-mode, v86 mode etc. etc.? I would argue not because otherwise all that cruft will effectively stay in the CPU. For those situations where it is relevant to eliminate all that, one could rely on pure software emulation (binary translation). So Intel should essentially remove the “unrestricted guest” feature from its virtualization extensions and only allow virtualization of the now greatly simplified supported status.

    What would be saved by all this? Probably not that many transistors, given how few transistors those features could be implemented with to begin with. But it could greatly simplify the microcode, the external documentation, and the design of bootloaders and operating systems (which currently have to, on every boot, essentially time-travel through all the old x86 modes before they reach 64-bit). Internally at Intel (and of course AMD) it would greatly simplify design. The microcode related to all this could be thrown out. All the verification related to the old stuff could be thrown out and all the in-house knowledge about these relics could disappear. At the true hardware level, maybe simplifications would open up due to some special cases disappearing, which could optimize some things (say the cache implementations dealing with the various modes just as one thing, but maybe also things in the pipeline itself). Internal documentation would be simplified. Onboarding of new hires, who I guess have to learn about the old stuff (at least some of them!), would be simplified, allowing for more concentration on new features.

    Oh, and in the same move, Intel could remove some of the old instructions no longer required and let go of the old x87 FPU and MMX support and maybe the parts of SSEx that proved useless.

    With the complexity of all the “old” backwards compatibility gone, Intel could now add a new submode of x86-64 going even further, maybe with new instruction set encodings and other “fixes” etc. The addition of new instruction sets has not been done in a terribly efficient way. A modern operating system could, maybe on a “per task” basis, flip a bit to indicate whether it is running new-style x86-64 or the old style.

    What are your thoughts and how do you see the evolution of x86-64 if absolute compatibility with very old software is no longer required (but we still wish to keep it almost the same for existing operating systems and software)?

    *) Of course a possible problem is that even modern operating systems expect to boot in real mode. But for UEFI-compliant systems that shouldn’t be an issue. Also, it seems new CPUs are only supported on the most recent version of Windows anyway, so that shouldn’t be a problem. And for Linux, the end-user can just download a new version.

  23. MiaM says:

    zeurkous: Keeping the possibility to run 32-bit x86 code is a good thing, because then you can run all old 32-bit Win32 apps without emulation.

    Keeping BIOS support for antique OSes and keeping support for 16-bit x86 mode makes less sense today.

  24. zeurkous says:

    Right, until they break that, too, as they did w/ original windoze
    programs.

    Now that even Intel has dropped the basic pr{e,o}mise of compat, nothing
    IBM pee-cee-related is safe anymore.

    But that’s okay. It’s time for the whole mess to go. It’s not like there
    aren’t any alternatives.

  25. raijinkai says:

    @zeurkous:
    Because UEFI isn’t just UEFI, but also ACPI, the security (TE) component, and SMM. IHVs need these components to implement resource enumeration, CPU and NUMA configuration, OS-aware power management, and smart hardware features (stuff which previously required fixed microcontrollers, ASICs and OS drivers) as ACPI machine code.
    UEFI is just the firmware framework which ties these components together better and offers a uniform interface (ACPI) to control them all, offering at the same time an open-source implementation of the firmware core (edk2/TianoCore) and MS blessing (compatibility with the Windows driver model).

  26. zeurkous says:

    Ugh, ACPI. That never did me any good and meseriously doubts it will
    magically start doing me good now.

    And mestill doesn’t see what ‘efi’ does that openboot doesn’t do 1337
    times better.

    And beware, ‘open source’ can (and too often does) still mean ‘makes
    your eyes bleed when you read it’… and m$’ blessing is IMNSHO the last
    thing one should be fishing for.

    Overall, mefeels that including (and mandating!) complex firmware is a
    mistake… but if one does, use openboot. It does everything more or
    less right.

  27. raijinkai says:

    @zeurkous
    As the need for complex but affordably priced hardware with strict power consumption requirements arises, the need for complex firmware increases too. It makes it possible to implement part of the hardware as software, and makes the OS not only aware of the specific features of each board, but also a participant in the hardware chain, getting an integrated and more power-friendly result. Of course you can get similar results with fixed-purpose ASICs and independent microcontrollers with hard-burned firmware, but then the price increases by a magnitude for each feature. That isn’t what people want, and of course investors don’t want fewer sales due to increased prices either.

    And yeah, it is true you can get a similar result to UEFI with the OpenBoot/RTAS/FDT combo, but there were a few important reasons why it didn’t catch on among IHVs: it was too *nix-minded, which made it complex to use in tandem with DOS-minded operating systems, and it required things like ARC firmware emulators for each board architecture in NT systems/WDM… But the main reason was only one… It required and still requires Forth to do anything advanced with it. No one, especially firmware programmers, ever wants to learn Forth only to program a few firmware modules. Apple learned about this, and migrated quickly from OFW to EFI in their x86 offerings.

    Also, there is the fact that the best OpenFirmware implementations out there, Firmworks OpenFirmware (Firmworks being also responsible for the VENEER ARC emulation layer over OFW for NT) and Sun OpenBoot, weren’t open source until just a few years ago. Up to that time, IHVs geared their production toolchains and procedures on top of BIOS/ACPI/SMM, with UEFI offering them the best migration path for their existing toolchain. No one will ever change their whole production machinery just to offer something more l33t… And since firmware is a piece supplied and burned by the IHV in each product offered, at least for workstations we will get UEFI for the long foreseeable future… And probably many smartphone makers will also migrate to it with the advent of the Qualcomm 8xx SoC series (which makes extensive use of UEFI and ACPI), at least for their high-end models and Win10 ARM64 ultrabooks.

  28. random lurker says:

    Since we’re on the subject of “what the f*ck, firmware?”, I cannot help but mention the proliferation of strange “embedded controllers” in laptops when it has for several years now been entirely possible to implement their functionality completely with SMM/UEFI.

    There’s a nice presentation by two guys breaking into the EC of an old Toshiba laptop and it’s frankly just weird. https://recon.cx/2018/brussels/resources/slides/RECON-BRX-2018-Hacking-Toshiba-Laptops.pdf

  29. Chris M. says:

    …and to take things full circle, they document when the A20 gate is enabled by the BIOS. 😛

  30. zeurkous says:

    @raijinkai: One might argue that if one needs significantly more complex
    stuff than, say, a PROM with some boot code and a couple of sense
    switches, one has already gone overboard with the general design of the
    machine.

    As for openboot, Intel could’ve simply licensed it for the time
    being. It’s not like they don’t have enough money. Hell, they could’ve
    prolly bought the copyright and released it into the public domain!

    Me’d suspect that the choice for a NIH equivalent has little to do with
    technical, financial, or legal reasons, but everything with the lack of
    OOB consideration that your typical pee-cee type is well-known for.

  31. raijinkai says:

    @zeurkous.
    “As for openboot, Intel could’ve simply licensed it for the time being.”
    First: At the time Intel started their OnNow firmware initiative, which became EFI and later UEFI, licensing OpenBoot would have meant entering into an agreement with Sun Microsystems, who were pushing their SPARC systems to compete with Xeon x86, Alpha, HPPA/Itanium and POWER systems. Licensing OpenFirmware would have meant entering into an agreement with Firmworks, which was pushing their RISC PowerPC initiative backed by Apple to compete directly with x86 workstations in the prosumer market. No way Intel would have done either of those.

    Second: Again… One of the objections against OpenFirmware implementations was the Forth requirement to do anything complex with it. There never were enough Forth programmers out there, and no firmware developer wanted to learn a new programming language and build supplementary tools just to be able to program firmware modules… Certainly true in the PC-compatible world, where almost all firmware development was geared towards C and ASM.

    Third: OFW never provided an easy way to migrate existing firmware development. Each vendor was responsible for providing such a mechanism, and some nice developments such as Firmworks’ ARC VENEER or, many years later, Codegen’s BIOS emulation interface never were part of the standard distribution. While UEFI also doesn’t offer a standard CSM implementation, it at least offers the interfaces and the API to implement one in a standard way, and eased code debugging and offered migration tools to existing BIOS developers, in order to facilitate CSM building and integration with UEFI from their existing BIOS codebases.

    “One might argue that if one needs significantly more complex
    stuff than, say, a PROM with some boot code and a couple of sense
    switches, one has already gone overboard with the general design of the
    machine.”
    Which actually is… well, everything. At the time when UEFI came to the x86 world in the form of the TianoCore project, machines with what are generally referred to as “simpler and ASM-geared” BIOS-driven designs already had 4 and 8 Mbit BIOS images. In this sense, when they came, UEFI/CSM firmware images were actually smaller, and 4 Mbit images were the rule for a long time, while offering better firmware development and usability capabilities than the rusting ROM-BIOS-based firmware distributions.

  32. zeurkous says:

    Oh, getting in bed w/ Sun would have humiliated Intel, that’s indeed
    a Big Thing(TM)… but still Intel’s own damn fault. They kept pushing
    i86 despite the fact that its major flaws have long been very very
    obvious. Ironically, early Suns IIRC used an Intel-designed bus.

    FORTH will never be me favourite language, but from what megathers, it’s
    exceedingly simple to learn. Nothing quite as complex as, say, i86
    assembler (shudder).

    The fact that Intel would need to put some work in adapting openboot
    seems obvious to me — about as obvious as the fact that it wouldn’t be
    outright prohibitive.

    As for the BIOS bloat, well, Intel’s solution was to make it worse. All
    that’s really needed is to talk on one cereal (excuse the pun) port and
    fetch a boot image over another. That doesn’t take one megabit, let
    alone 4 or 8. Oh, and if a local boot is desired, the root fs can be on
    memory-mapped flash. It’s not hard, people. Me milk-soaked morning
    cornflakes are harder.

  33. Richard Wells says:

    I incorrectly remembered what software changed the A20 on every access. It was the DOS LAN Manager redirector as described at https://blogs.msdn.microsoft.com/larryosterman/2004/11/08/how-did-we-make-the-dos-redirector-take-up-only-256-bytes-of-memory/

    All that is left is to find the program that used “Call 5” to cleverly avoid having to manage the stack.

  34. Michal Necasek says:

    I’ve seen servers that are EFI only (possibly 64-bit only, not sure), no BIOS and no CSM.

  35. Michal Necasek says:

    A protected-mode only x86 CPU would not be exactly a new thing… see the Intel 80376.

    A 64-bit only mainstream x86 CPU would fail miserably in today’s environment because there’s a lot of 32-bit software out there. That means you can’t get rid of most of the complexity. Real mode and 16-bit code are probably comparatively minor nuisances. I’m also not at all certain just how much throwing out a few old instructions would simplify the CPU design. My guess is “not much”. The real complexity is in paging, caching, speculative execution, etc.

    I think we will find out in a few years how well a non-legacy-encumbered 64-bit CPU can do in the market, but it’s going to be ARM.

  36. Michal Necasek says:

    If you listen to some people, IHVs should leave software alone, stick to hardware, and let the software guys (OS people) write software. I’m not sure anyone has been able to adequately explain why insanely complex, opaque, insecure, and rapidly obsoleted firmware is something everyone absolutely has to have. But certainly that’s the way things are done.

    Also, ACPI isn’t UEFI, we all know that ACPI pre-dates UEFI and was developed independently for many years. I understand the value of ACPI to IHVs but it’s as much part of EFI as it was part of BIOS until recently. (That is to say, pretty independent of it.)

  37. Michal Necasek says:

    Eyes bleed? Have you been trying to read EFI source code? Actually that doesn’t make my eyes bleed, but it really makes my brain hurt.

    I think in general hardware people are no worse at writing software than software people are at designing hardware, only the latter pretty much never happens in practice.

  38. Michal Necasek says:

    Thanks for the bit of history. As always, there are solid reasons for why we have what we have, much like the A20 gate. It may not make much sense now, but the decisions that were made at the time were by no means crazy.

    As someone who has written a good chunk of highly compatible BIOS code in C, I have to say that I never understood why BIOS developers insisted on writing everything in assembler (and sometimes truly terrible assembler at that). Inertia I guess.

  39. Michal Necasek says:

    Yes. Explicitly mentions how it was only when designing the 386 that Intel realized the value of compatibility with existing shrink-wrapped software. When the 286 was done, Intel didn’t think much of it, and in any case the 186/286 was designed in a world with no IBM PC in it.

  40. Michal Necasek says:

    I remember reading that years ago — I guess that code was written in the late 1980s, before DOS 5. Makes sense they’d have to switch A20 all the time, I imagine the requests might have been coming from the network asynchronously, so always switching was the only safe thing to do.

    The trouble with CALL 5 applications is that normally they just work… and to find one requires somehow putting a breakpoint on the entry point from CALL 5 in the DOS function dispatcher. I suppose I should try that sometime and see if it hits with anything besides my own test utility.

  41. dosfan says:

    Since the original IBM BIOS was written in assembly language, everyone else did the same; plus, it was a while before it was possible to write an interrupt handler in C. Well written assembly language code is always more efficient than any high-level language. This was certainly true with regard to early C compilers. As for some BIOS codebases being messy, like PhoenixBIOS, it accumulated years of features and support for various chipsets without any cleanup efforts.

  42. MiaM says:

    I assume x86 32-bit compatibility will still be kept for a while, but the performance of running 32-bit code will be of less importance as time goes by. (I assume that this kind of development might even have started a few years ago.)

    The question is when we reach the point where 32-bit software is so old anyway that the OS might as well emulate a 32-bit x86 CPU?

  43. Yuhong Bao says:

    Which also reminds me of DOS. If you look at the MS-DOS 7.1 code you will see a lot of CLI/STI instructions where 386 32-bit instructions are being used. I suspect that FreeDOS probably doesn’t bother, right?

  44. Yuhong Bao says:

    Interestingly enough, PC DOS 7.1 doesn’t use 386 instructions in the FAT32 implementation.

  45. Yuhong Bao says:

    I wonder why PC DOS 7.1 did not base its FAT32 implementation on MS-DOS, BTW.

  46. dosfan says:

    It is unknown why IBM did PC DOS 7.1 in the first place. My guess is for Symantec’s Norton Ghost. Anyway, IBM didn’t have access to the Microsoft codebase at that point. The last version they got was MS-DOS 6.0, which was used to make PC DOS 6.1.

  47. Richard Wells says:

    The IBM product for which DOS 7.1 was supplied was ServerGuide Scripting Toolkit which needed FAT32 as part of the effort to install Windows 2000 across the network. The last DOS based version was released in 2008.

  48. Michal Necasek says:

    Certainly, when the PC BIOS was first written C was not an option (not even remotely). Only around 1990 did the compilers get good enough.

    And yes, the messy code was a result of poor/insufficient maintenance, the original code was usually very neat and clean but over time turned into a pile of rotting spaghetti.
