Undocumented 8086 Opcodes, Part I

This is a guest post by Raúl Gutiérrez Sanz

This multi-part document is about undocumented 8086 processor opcodes and their behavior. Most of the document will likely apply to the 8088 processor as well, but this has not been verified. It doesn’t apply to any other processor/controller, like the 80186, 80286 or newer, as they use the undocumented 8086 opcodes to implement new instructions. For the same reason, it does not apply to NEC V20/V30 processors either. And even when 8086 opcodes remain undocumented on newer processors, their behavior is unlikely to be the same (not least because starting with the 80186, undefined opcodes generally raise an invalid opcode exception, interrupt 6).

Sometimes it is not easy to determine which opcodes are documented and which ones are not, because some of them appeared in or disappeared from the official Intel documentation at some point. So, while most opcodes listed in this document have never been officially documented, you may find some of them in certain Intel documents, or at least in some editions of those documents.

On the 8086, all undocumented opcodes do something, but typically not something very useful. After all, if they did something useful, they would have been documented.

This document will be split into three sections:

  • Section I—Holes In the Opcode Map
  • Section II—Holes In the Addressing Scheme
  • Section III—“Nonsense” Instruction/Operand Combinations

For some undocumented features, the categorization is admittedly arbitrary.

The Motivation

I always wondered why systems cloning the 8086 processor behavior (like emulators, FPGA implementations, and hypervisors) never claim 100% compatibility with such an old and simple (from a modern point of view) device. Then I discovered that there’s still a lot of undocumented behavior and assumed that’s one of the reasons. How do you clone the behavior of something when you don’t know how it behaves? That’s one of the reasons I did the research and wrote this document, apart from the personal challenge. Something, or even a lot, must have been written on this subject in the past, when the 8086/8088 was the mainstream processor, but apart from a smidgen of information about several instructions (SALC and a few more) I couldn’t find any comprehensive information in books or on the Internet. If something was written, it may not have survived.

I would like modern 8086 hardware or software implementations to make use of this document to get closer to the real processor behavior. Feel free to use my research, just include a reference to this document.

The Approach

First I made a list of all undocumented features of the 8086 CPU. When I considered it complete, I installed some very simple debuggers and an assembler on one of the newest 8086-based PCs: an Amstrad 5086 with a Siemens 8086 CPU. Then it was a matter of testing everything in the list and finding out what happened. That involved much use of the debuggers, and writing some pieces of code. And a lot of switching off followed by switching on.

The problem with undocumented instructions is that you are never sure what the result depends on. Will the result be the same if I change a register, or a flag, or memory contents, or a combination of registers, flags and memory contents? You are never sure about the results either. You can notice a register changed after executing the instruction. But, did anything else change as well?

After many tries you may find that there is a logic connecting what you consider the inputs and the results. But even after testing the same instruction under hundreds of different conditions, there is much guesswork involved.

Especially difficult to analyze were instructions which jump to unexpected locations and also instructions affecting what seems to be an undocumented internal register. Debuggers obviously cannot preserve such an internal register; there is not even a way to see what it contains.

References

  • MCS-86 Assembly Language Reference Guide, October 1978 (Intel document number 9800749-1); especially the instruction set matrix on pages 20-21, which will be referred to as “the instruction matrix”
  • 8086 16-BIT HMOS MICROPROCESSOR (Intel document number 231455-005)
  • ISA System Architecture, Third Edition – Tom Shanley and Don Anderson – MindShare, Inc. (ISBN 0201409968)

Section I: Holes In the Opcode Map

A recurring theme in this section is that the processor ignores certain bits when decoding instructions. That leads to the existence of aliases where an undocumented opcode exists as a duplicate of a documented one. This is likely a side effect of the fact that the 8086 has no concept of an invalid opcode; when decoding instructions, the CPU does not have anything better to do with undocumented opcodes, so it might as well save some effort and not decode certain bits at all.

Another common case occurs when an instruction logically exists in the opcode map but performs a function which is not useful (POP CS, various shifts).

Opcode 0Fh

Documented equivalent: None
Instruction: POP CS
Notes: This instruction is mentioned in some 3rd party references; it is as obvious as it is useless.

Opcode 6Xh (60h – 6Fh)

Documented equivalent: 7Xh (70h – 7Fh)
Instruction: Jxx
60h (equivalent: 70h) = JO
61h (equivalent: 71h) = JNO
62h (equivalent: 72h) = JC
63h (equivalent: 73h) = JNC
64h (equivalent: 74h) = JE
65h (equivalent: 75h) = JNE
66h (equivalent: 76h) = JBE
67h (equivalent: 77h) = JA
68h (equivalent: 78h) = JS
69h (equivalent: 79h) = JNS
6Ah (equivalent: 7Ah) = JPE
6Bh (equivalent: 7Bh) = JPO
6Ch (equivalent: 7Ch) = JL
6Dh (equivalent: 7Dh) = JGE
6Eh (equivalent: 7Eh) = JLE
6Fh (equivalent: 7Fh) = JG
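
The table above is consistent with a decoder that ignores bit 4 of the opcode for this group. As a minimal sketch of how an emulator might fold these aliases onto the documented opcodes (Python is used purely for illustration, and `canonical_jcc` is a hypothetical helper name, not something from Intel documentation):

```python
# Hypothetical helper: fold the undocumented 60h-6Fh conditional jumps
# onto their documented 70h-7Fh twins. The two ranges differ only in
# bit 4 (10h), so setting that bit yields the documented opcode.
def canonical_jcc(opcode: int) -> int:
    if 0x60 <= opcode <= 0x6F:
        return opcode | 0x10
    return opcode  # anything else is left untouched
```

An emulator could call such a helper before its normal dispatch, so that only the sixteen documented conditional jumps need real implementations.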

Opcode C0h

Documented equivalent: C2h
Instruction: RET imm16 – In the instruction matrix: “RET (i+SP)”

Opcode C1h

Documented equivalent: C3h
Instruction: RET

Opcode C8h

Documented equivalent: CAh
Instruction: RETF imm16 – In the instruction matrix: “RET l, (i+SP)”

Opcode C9h

Documented equivalent: CBh
Instruction: RETF – In the instruction matrix: “RET l”
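
The four RET aliases listed above (C0h, C1h, C8h, C9h) each differ from their documented equivalent only in bit 1. The following sketch records that mapping as a lookup table; the names `RET_ALIASES` and `canonical_ret` are hypothetical and chosen only for this illustration:

```python
# Undocumented RET opcode -> documented equivalent, as listed in the text.
RET_ALIASES = {
    0xC0: 0xC2,  # RET imm16
    0xC1: 0xC3,  # RET
    0xC8: 0xCA,  # RETF imm16
    0xC9: 0xCB,  # RETF
}

def canonical_ret(opcode: int) -> int:
    """Return the documented opcode for a RET alias, else the opcode itself."""
    return RET_ALIASES.get(opcode, opcode)

# Sanity check: every alias equals its documented twin with bit 1 cleared.
assert all(alias | 0x02 == doc for alias, doc in RET_ALIASES.items())
```

A table is used here rather than a blanket `opcode | 0x02`, because the bit-1 relationship is only claimed for these four opcodes.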

Opcode D0h xx110xxxb

Documented equivalent: None
Instruction: SETMO byte R/M
Action: Moves byte -1 (FFh) to its 8-bit operand and sets flags accordingly

  • 8-bit operand <— FFh (AL for D0F0h, CL for D0F1h…)
  • Clear CF (NC)
  • Set PF (PE)
  • Clear AF (NA)
  • Clear ZF (NZ)
  • Set SF (NG)
  • Clear OF (NV)

Byte operand and flags (not sure about AF) are modified as if the following instruction were executed: OR AL, 0FFh

Unlike SALC:

  • Result does not depend on CF
  • Destination register is not always AL
  • Instruction changes flags

Notes:

  • “SETMO” stands for “SET Minus One”
  • D0h is the first byte of the byte-operand, 1-position shift instructions (in the instruction matrix: “Shift b”). The type of shift is specified in bits 3 to 5 of the second byte, but it is not documented when these bits are 110b.
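
As a rough model of the behavior described above, the sketch below decodes the second byte and applies SETMO to a register operand. It is a simplification under stated assumptions: only the register form (top two bits 11b) is handled, registers and flags are plain Python dictionaries, and the names `REG8`, `is_setmo_modrm` and `setmo_byte` are hypothetical. The AF value follows the list above, which the text itself marks as uncertain.

```python
# 8-bit register order used by the 8086 for the low three bits
# of the second instruction byte.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]

def is_setmo_modrm(modrm: int) -> bool:
    """True when bits 3 to 5 of the second byte are 110b."""
    return (modrm >> 3) & 0b111 == 0b110

def setmo_byte(regs: dict, flags: dict, modrm: int) -> None:
    """Model D0h xx110xxxb for register operands only (assumption)."""
    if not is_setmo_modrm(modrm):
        raise ValueError("second byte does not select the 110b slot")
    if modrm >> 6 != 0b11:
        raise NotImplementedError("memory operands are not modelled here")
    regs[REG8[modrm & 0b111]] = 0xFF
    # Flag results as listed in the text (AF is reported as uncertain).
    flags.update(CF=0, PF=1, AF=0, ZF=0, SF=1, OF=0)
```

For example, calling `setmo_byte(regs, flags, 0xF1)` models the byte sequence D0h F1h and writes FFh to CL.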

Opcode D1h xx110xxxb

Documented equivalent: None
Instruction: SETMO word R/M
Action: Moves word -1 (FFFFh) to its 16-bit operand and sets flags accordingly

  • 16-bit operand <— FFFFh (AX for D1F0h, CX for D1F1h…)
  • Clear CF (NC)
  • Set PF (PE)
  • Clear AF (NA)
  • Clear ZF (NZ)
  • Set SF (NG)
  • Clear OF (NV)

Word operand and flags (not sure about AF) are modified as if the following instruction were executed: OR AX, 0FFFFh

Notes:

  • “SETMO” stands for “SET Minus One”
  • D1h is the first byte of the word-operand, 1-position shift instructions (in the instruction matrix: “Shift w”). The type of shift is specified in bits 3 to 5 of the second byte, but it is not documented when these bits are 110b.
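
The flag values listed above can be cross-checked against the usual rules for a logical OR: CF and OF are cleared, while ZF, SF and PF are derived from the result. The sketch below applies those rules to the result of OR with all ones. The function name `or_flags` is hypothetical, and AF is deliberately left out because OR leaves it undefined and the text is unsure about it.

```python
def or_flags(result: int, width: int) -> dict:
    """Flags produced by a logical OR whose result is `result` (AF omitted)."""
    result &= (1 << width) - 1
    low_byte_ones = bin(result & 0xFF).count("1")
    return {
        "CF": 0,                              # logical ops clear CF
        "OF": 0,                              # logical ops clear OF
        "ZF": int(result == 0),
        "SF": (result >> (width - 1)) & 1,    # top bit of the result
        "PF": int(low_byte_ones % 2 == 0),    # even parity of the low byte
    }

# OR with all ones always gives all ones, whatever the operand was.
word_flags = or_flags(0x1234 | 0xFFFF, 16)
byte_flags = or_flags(0x00 | 0xFF, 8)
```

Both `word_flags` and `byte_flags` come out as CF=0, OF=0, ZF=0, SF=1, PF=1, which agrees with the SETMO flag list.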

Opcode D2h xx110xxxb

Documented equivalent: None
Instruction: SETMOC byte R/M
Action: If CL != 0, changes 8-bit operand and flags like SETMO (D0h xx110xxxb instruction code), otherwise nothing changes

Notes:

  • “SETMOC” stands for “SET Minus One if CL != 0”
  • D2h is the first byte of the byte-operand, CL-positions shift instructions (in the instruction matrix: “Shift b,v”). The type of shift is specified in bits 3 to 5 of the second byte, but it is not documented when these bits are 110b.
  • The destination operand can be specified, but the zero-test is always performed on CL.

Opcode D3h xx110xxxb

Documented equivalent: None
Instruction: SETMOC word R/M
Action: If CL != 0, changes 16-bit operand and flags like SETMO (D1h xx110xxxb instruction code), otherwise nothing changes

Notes:

  • “SETMOC” stands for “SET Minus One if CL != 0”
  • D3h is the first byte of the word-operand, CL-positions shift instructions (in the instruction matrix: “Shift w,v”). The type of shift is specified in bits 3 to 5 of the second byte, but it is not documented when these bits are 110b.
  • The destination operand can be specified, but the zero-test is always performed on CL.
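
The conditional behavior of SETMOC can be summarized in a small pure function. This is only a functional sketch of the description above: it takes the operand value and CL as arguments instead of modelling real registers, it says nothing about timing, and the name `setmoc` is hypothetical.

```python
def setmoc(operand: int, cl: int, flags: dict, width: int = 8):
    """Return (new_operand, new_flags) for SETMOC on a byte or word operand."""
    if cl & 0xFF == 0:
        # CL is zero: neither the operand nor the flags change.
        return operand, dict(flags)
    new_flags = dict(flags)
    # Same flag results as SETMO (AF is reported as uncertain in the text).
    new_flags.update(CF=0, PF=1, AF=0, ZF=0, SF=1, OF=0)
    return (1 << width) - 1, new_flags
```

Note that the test is on CL even when the destination is some other register or a memory operand, which is why `cl` is a separate argument here.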

Opcode D6h

Documented equivalent: None
Instruction: SALC
Action: If CF is set (CY), moves 0FFh to AL, otherwise (NC) moves 0 to AL
Does not modify flags
Register AL (but not flags) modified as if the following instruction were executed: SBB AL, AL

Notes:

  • “SALC” stands for “SET AL to Carry”
  • The destination operand is always AL.
  • SALC is documented in many 3rd party publications and exists in all Intel x86 CPUs
  • For years, Intel refused to document SALC even though it acknowledged that D6h is not an invalid opcode; SALC was mentioned by name for the first time in the October 2017 edition of the Intel SDM (yes, really!)
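
The equivalence between SALC and the AL result of SBB AL, AL can be checked exhaustively in a few lines. This is a sketch of the arithmetic only; the function names `salc` and `sbb_al_al` are hypothetical, and flags are not modelled since SALC does not change them.

```python
def salc(cf: int) -> int:
    """AL after SALC: FFh if CF is set, 00h otherwise."""
    return 0xFF if cf else 0x00

def sbb_al_al(al: int, cf: int) -> int:
    """AL after SBB AL, AL: AL - AL - CF, truncated to 8 bits."""
    return (al - al - cf) & 0xFF

# The previous value of AL never matters; only CF does.
matches = all(
    sbb_al_al(al, cf) == salc(cf)
    for al in range(256)
    for cf in (0, 1)
)
```

Here `matches` is true for every combination of AL and CF, which is the sense in which SALC behaves like SBB AL, AL without the flag update.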

Opcode F1h

Documented equivalent: F0h
Instruction: LOCK
Notes: LOCK is technically a prefix, not an instruction

(Section I to be continued.)

About the Author

Raúl Gutiérrez Sanz has a degree in Computer Science from the University of Valladolid (Spain). He has been working as an analyst for 20 years and playing with (mostly old) games and PC hardware.


13 Responses to Undocumented 8086 Opcodes, Part I

  1. Today I learned about SETMO/SETMOC – I had thought these to be aliases of SHL for some reason, but they do seem to work the same on my Intel 8088 as on your Siemens chip.

    One thing to be aware of when using SETMOC in real programs, though, is that opcodes D2 and D3 have a loop in the microcode that executes CL times, so the higher the value in CL the longer the instruction takes. So a SETMOC with CL==0xff will have the same effect as with CL==1 but will take much longer – I timed it as 1031 cycles on my 8088, or about 216 microseconds. As well as making your program slow it will also cause high interrupt latency since hardware IRQs are only serviced between instructions. Fortunately DRAM refresh DMA cycles can continue unimpeded through a long-running instruction, so there’s no danger of this instruction causing DRAM decay.

  2. Michal Necasek says:

    The current assumption is that all undocumented opcodes are the same across 8086 and 8088, as well as across Intel and all second source manufacturers (AMD, Harris, Siemens, etc.). On the other hand, any clone (NEC, C&T) or emulator is likely going to behave very very differently.

    I seem to recall that the 8087 had documented instructions in the 1000-cycle range execution time. I suppose that’s part of the reason why DRAM refresh was handled by the DMA controller in the IBM PC, independent of the CPU.

  3. Miod Vallat says:

    The “POP CS” instruction was not as useless as one can think – it was the quickest and smallest way for code in the boot sector to jump to memory after copying its image. In fact, the well-known “ping pong” virus made use of this instruction, which caused virus-infected disks to no longer be able to boot on 80286 systems; but it did not take long for an update of the virus to appear….

  4. Yuhong Bao says:

    @Andrew Jenner: Which is probably not surprising since it shares the same opcode as shifts.

  5. anon says:

    D0/D1/D2/D3 with reg=110 is SAL

    SDM -064, volume 3, section 22.15
    APM 3.25, volume 3, table A-6

  6. Raúl Gutiérrez Sanz says:

    Those manuals do not apply to the 8086. Just look at the SDM and APM titles.

  7. Michal Necasek says:

    They don’t and they do. For example the latest Intel SDM mentions the SALC instruction. Does that apply to the 8086?

  8. I found some more undocumented instructions when disassembling the microcode! https://www.reenigne.org/blog/8086-microcode-disassembled/

  9. Michal Necasek says:

    Very cool!

    INT AL is a sort of obvious omission in the instruction set. The REP MUL/DIV thing is very interesting. I wonder if it was ever used by publicly available software.

    You also have to wonder how many people did the analysis since 1978 and never published their findings.

  10. anon says:

    SETMO… wouldn’t it be simpler to just call that…

    OR Eb,-1(,1)
    OR Ew,-1(,1)
    OR Eb,-1(,CL)
    OR Ew,-1(,CL)

    …to better match the rest of group #2? That would
    capture the FLAGS behavior… and it is not clear if
    the dst operand is actually read at the beginning –
    it would take a bus analyzer to tell, wouldn’t it? 🙂

    PS: What happened to the continuation of section I,
    let alone sections II and III? Will they still happen?

  11. Anonymous says:

    Are we going to get parts II and III eventually?

  12. Michal Necasek says:

    That’s up to the original author, not me… I would say the chances are low but not zero 🙂

  13. Tor User says:

    The first CPU whose manual documents D0/6, D1/6, D2/6, and D3/6 seems to be the 486[1]. Although on page 26-253 (739 in the PDF), it says that both SAL and SHL are /4, on page A-8 (page 788 in the PDF), it says that Group2/100 is SHL and that Group2/110 is also SHL. I’m also extremely curious about whether the 80186, 80286, and 80386 would raise interrupt 6, execute SETMO, or execute SAL, but sadly I lack the hardware to test it.

    [1] The manual I saw it in is “i486 Microprocessor Programmer’s Reference Manual,” ISBN 1-55512-101-2, obtained from https://archive.org/details/bitsavers_intel80486mmersReferenceManual1990_29642780
