The 8086/8088 is a 16-bit processor and offsets within a 64K segment always wrap around. If a one-byte instruction at offset FFFFh is executed on an 8086, execution will continue at offset 0. This is simply a consequence of the Instruction Pointer (IP) being a 16-bit register.
Funny things happen when an access crosses a segment boundary. On an 8086, it will also wrap around; accessing a word at offset FFFFh will access one byte at offset FFFFh and one byte at offset 0 in a segment. Again, that is a consequence of 16-bit address calculations.
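The address arithmetic behind that wraparound can be sketched in a few lines of Python, used here purely as a calculator. The helper name `word_access_addresses` is made up for illustration, and the model assumes the usual 8086 scheme of a 20-bit physical address formed from segment × 16 plus a 16-bit offset.

```python
def word_access_addresses(segment, offset):
    """Return the two physical addresses touched by a word access on an 8086.

    Illustrative model only: each byte's offset is truncated to 16 bits
    before the segment base is added, so the second byte of a word at
    offset FFFFh lands at offset 0 of the same segment.
    """
    base = segment << 4
    low_byte = (base + (offset & 0xFFFF)) & 0xFFFFF
    high_byte = (base + ((offset + 1) & 0xFFFF)) & 0xFFFFF
    return low_byte, high_byte


low, high = word_access_addresses(0x1000, 0xFFFF)
print(f"{low:05X} {high:05X}")  # prints "1FFFF 10000"
```

With segment 1000h, the low byte comes from the last byte of the segment and the high byte from its first byte, 64K lower in memory.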
The 80286 got a lot smarter about this. Segment protection prevents accesses that wrap around the end of a segment, for both data and instructions. The 80386 continued using the same logic.
The 286 and 386 support one special case, stack wraparound. When the 16-bit Stack Pointer (SP) is zero, pushing (say) a word on the stack will wrap around and the new SP will be FFFEh. This feature was required for 8086 compatibility, because a full-size 64K stack needs to start with SP=0. The pushes and pops must be aligned for the wraparound to occur; unaligned accesses will cause protection faults.
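As a rough illustration of the aligned case, here is a toy model of a 16-bit push in Python. The function `push_word` and the `bytearray` standing in for the stack segment are assumptions made for this sketch; it does not model the protection checks themselves.

```python
def push_word(sp, stack, value):
    """Model a 16-bit push: decrement SP by 2 modulo 64K, then store the word."""
    sp = (sp - 2) & 0xFFFF
    stack[sp] = value & 0xFF                        # low byte
    stack[(sp + 1) & 0xFFFF] = (value >> 8) & 0xFF  # high byte
    return sp


stack = bytearray(0x10000)  # one full 64K stack segment
sp = push_word(0x0000, stack, 0x1234)
print(f"SP = {sp:04X}")     # prints "SP = FFFE"
```

Starting from SP=0, the word lands at offsets FFFEh and FFFFh, entirely inside the segment. Starting from an odd SP such as 1, the new SP would be FFFFh and the word would straddle the end of the segment, which is the unaligned case that faults on the 286 and 386.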
Does the instruction pointer also wrap around in a way similar to the stack segment?
Let’s consider the following simple DOS program:
        .model  small
        .code

        ; Offset 0 of the code segment: reached only after "wrapping around"
        mov     dx, offset msg_bot
        mov     ah, 9
        int     21h
        mov     ax, 4C00h       ; terminate program
        int     21h

_start:
        mov     ax, _DATA
        mov     ds, ax
        mov     dx, offset msg_str
        mov     ah, 9           ; DOS print string
        int     21h
        jmp     near_end

        org     0FFF8h
near_end:
        mov     dx, offset msg_top
        mov     ah, 9
        int     21h
        inc     ax              ; one-byte instruction at offset FFFFh

        .data
msg_bot db      'Wrapped around to start of segment',13,10,'$'
msg_top db      'Near top of code segment',13,10,'$'
msg_str db      'Entered program',13,10,'$'

        .stack
        end     _start
The program is constructed such that the one-byte ‘inc ax’ instruction is at offset FFFFh in the code segment.
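The placement can be double-checked by adding up instruction lengths from the `org 0FFF8h` directive. The sketch below does that arithmetic in Python. The byte counts come from the standard 8086 encodings (`mov dx, imm16` is BAh plus a word, `mov ah, imm8` is B4h plus a byte, `int 21h` is CDh 21h, `inc ax` is 40h); the variable names are illustrative only.

```python
ORG = 0xFFF8

# (instruction text, encoded length in bytes)
instructions = [
    ("mov dx, offset msg_top", 3),  # BA iw
    ("mov ah, 9", 2),               # B4 ib
    ("int 21h", 2),                 # CD ib
    ("inc ax", 1),                  # 40
]

offsets = {}
addr = ORG
for text, size in instructions:
    offsets[text] = addr
    print(f"{addr:04X}  {text}")
    addr += size

print(f"next offset: {addr:05X}")   # prints "next offset: 10000"
```

The listing shows `inc ax` at FFFFh, and the offset following it is 10000h, one past the largest value a 16-bit IP can hold.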
When executed on a typical PC compatible system, the program will print the following:
C:\>wrap
Entered program
Near top of code segment
Wrapped around to start of segment

C:\>
Clearly the instruction pointer wrapped around 64K. Case closed.
But wait! Not so fast. Although it looks like the IP wrapped around, what actually happened is a bit more complicated, and much more interesting.
After executing ‘inc ax’ on a 386 compatible CPU, the EIP instruction pointer will not wrap to zero but rather advance to 10000h. This will trigger a #GP (General Protection) fault when the CPU attempts to execute the next instruction, because 10000h is past the 64K segment limit.
The #GP fault vector is 13 (0Dh). But in a PC compatible system, that is also the vector for hardware interrupt IRQ5. If there is nothing using IRQ5, the default BIOS handler will examine the interrupt controller state, decide that nothing happened, and execute IRET. Even if some peripheral is using IRQ5, the interrupt handler will eventually return with an IRET instruction.
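The vector collision is easy to see with a little arithmetic. The sketch below assumes the conventional PC arrangement in which the BIOS programs the master interrupt controller so that IRQ0 through IRQ7 arrive on vectors 08h through 0Fh; the constant and function names are invented for this example.

```python
MASTER_PIC_BASE = 0x08   # assumed: conventional PC mapping of IRQ0..IRQ7
GP_FAULT_VECTOR = 0x0D   # #GP is exception 13


def irq_to_vector(irq):
    """Map a master-PIC IRQ line (0..7) to its interrupt vector."""
    if not 0 <= irq <= 7:
        raise ValueError("only IRQ0..IRQ7 are handled by the master PIC")
    return MASTER_PIC_BASE + irq


sharing = [irq for irq in range(8) if irq_to_vector(irq) == GP_FAULT_VECTOR]
print(sharing)  # prints "[5]"
```

Only IRQ5 shares its vector with #GP, so a handler installed for that vector cannot tell from the vector number alone whether it was entered by the hardware interrupt or by the fault.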
And that’s where the trick is. When the #GP fault occurs in real mode, the CPU can only push a 16-bit code offset on the stack. Instead of 10000h, it pushes zero. When the interrupt handler returns, execution continues at offset zero instead of at the address where the fault actually occurred (offset 10000h).
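The truncation can be modeled with a toy version of the real-mode interrupt frame. This is a sketch under simplifying assumptions: the frame is a plain Python list rather than memory at SS:SP, and the function names and the sample CS and FLAGS values are hypothetical.

```python
def real_mode_fault_frame(flags, cs, eip):
    """Values saved by a real-mode interrupt or exception, in push order.

    Only 16 bits of each value are saved, so the high half of EIP is lost.
    """
    return [flags & 0xFFFF, cs & 0xFFFF, eip & 0xFFFF]


def iret_16(frame):
    """16-bit IRET: restore IP, CS and FLAGS from the saved frame."""
    flags, cs, ip = frame
    return cs, ip


frame = real_mode_fault_frame(flags=0x0202, cs=0x1234, eip=0x10000)
cs, ip = iret_16(frame)
print(f"resume at {cs:04X}:{ip:04X}")  # prints "resume at 1234:0000"
```

The faulting EIP of 10000h is saved as 0000h, so the handler's IRET resumes at the very start of the code segment, which is exactly what makes the program appear to wrap around.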
In protected mode, the behavior is a bit more obvious; assuming that 32-bit interrupt handlers are used, the CPU will push the full 32-bit EIP value on the stack. An IRET instruction will not be able to return because it will #GP fault trying to transfer control to an offset past the segment limit.
The same DOS program shown above does not successfully run in an OS/2 VDM. That is a strong hint that DOS applications do not rely on such wraparound, because it would be relatively easy for OS/2 to support that.
Protected-mode 16-bit programs will usually terminate with some form of protection fault if they try to execute past 64K. It is only in the PC compatible DOS environment that the wraparound seemingly occurs, due to a combination of interrupts losing the high half of EIP and the #GP fault being aliased to a hardware interrupt.
Needless to say, 16-bit code segments on a 386 can have any segment limit, up to 4GB. No tool that I know of supports oversized 16-bit code segments (normal 16-bit near jumps and calls can only generate 16-bit offsets, but it is possible to produce 32-bit offsets in 16-bit code). Such segments are of very limited use in real mode, because every interrupt will lose the high word of EIP. In the end, it’s much more straightforward to use proper 32-bit code segments or at least multiple 16-bit code segments.