Geometry Problems

When introducing hard disk support in the PC/XT back in early 1983, IBM made a very unfortunate design decision: the information about drive geometry was exposed in the BIOS, and even worse, in the boot sector stored on the disk.

SCSI storage devices always used logical block numbers to access data, and drive geometry was something internal to a disk drive, not visible to the user. But SCSI was not yet fully established in the early 1980s and IBM was looking for a cheap technology. Therefore, simple MFM hard disks and controllers were used in both the IBM PC/XT and later the PC/AT. On the hardware level, both addressed disk sectors in terms of cylinders, heads, and sectors (CHS).

Ironically, the DOS FAT file systems never cared about geometry and only used logical sector (or cluster) numbers, with a low-level BIOS driver translating from logical sector numbers to whatever addressing scheme the underlying storage required.
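The translation between a logical sector number and a CHS triplet is simple arithmetic. Below is a minimal sketch of the conventional mapping, not the code of any particular BIOS; the function names and the 4-head, 17-sector geometry are assumptions chosen purely for illustration.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Map a CHS triplet to a logical sector number (sector numbers are 1-based)."""
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range for this geometry")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector out of range for this geometry")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping: split a logical sector number into (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1


# Illustrative geometry only: 4 heads, 17 sectors per track.
HEADS, SPT = 4, 17
print(lba_to_chs(0, HEADS, SPT))    # (0, 0, 1)
print(lba_to_chs(100, HEADS, SPT))  # (1, 1, 16)
```

Note that the result depends entirely on the heads and sectors-per-track values: the same logical sector number corresponds to a different CHS triplet under a different geometry.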

As an aside, the SCSI protocol was a natural fit for USB mass storage devices, which are typically implemented either as disks or flash storage. For modern disks, actual drive geometry is usually quite complex and hidden, and for flash storage, there is none.

The original BIOS disk interface (INT 13h) only supported CHS addressing. Around 1995, an extended BIOS interface was created, with support for Logical Block Addressing (LBA) which was not geometry dependent.

Unfortunately, geometry information was also stored on the disk itself, in the partition table that was part of the master boot record (MBR), as well as any extended partition tables (if present). The MBR was never defined by the BIOS—the bootstrap code in the BIOS simply loaded the MBR when a valid signature was found, and executed whatever code was in the MBR. The partition table format was initially defined by DOS, but soon every PC OS understood it.

That turned out to be a very bad thing. Because the BIOS provided no interface to access the partition information, each OS had to read, parse, and potentially write the partition information on its own. The PC industry never managed to modernize the MBR because all the existing disks with existing MBRs had to keep working. As of 2011, the MBR, designed nearly 30 years ago, is still in widespread use. A replacement called GPT exists—the GUID Partition Table defined by EFI—which is only very slowly gaining acceptance.

When IDE drives first appeared in the late 1980s, the true disk geometry was effectively hidden (just like in SCSI disks) and the controller supported a new mode called LBA, or Logical Block Addressing. Instead of using cylinder/head/sector triplets, a single block (sector) number was used. Unfortunately, the dependencies on disk geometry were never fully eradicated, and even flash-based IDE drives had to pretend that they had some number of cylinders, heads, and sectors.

The partition table stored in the MBR contained both CHS addresses and sector numbers, but in practice the two didn’t always match and the CHS address data was often considered more reliable. Old boot programs only used the original BIOS INT 13h interface and likewise needed CHS addressing.
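To make the duplication concrete, here is a rough sketch that decodes one 16-byte MBR partition entry and shows the CHS fields sitting next to the sector-number fields. The field layout is the conventional one; the function names and the sample entry are made up for illustration and do not come from any real disk.

```python
import struct


def decode_chs(raw):
    """Unpack the 3-byte packed CHS field used in MBR partition entries."""
    head = raw[0]
    sector = raw[1] & 0x3F                       # low 6 bits
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]   # 2 high bits + 8 low bits
    return cylinder, head, sector


def parse_partition_entry(entry):
    """Split a 16-byte partition entry into its CHS and sector-number fields."""
    if len(entry) != 16:
        raise ValueError("a partition entry is exactly 16 bytes")
    status, chs_first, ptype, chs_last, lba_first, count = struct.unpack(
        "<B3sB3sII", entry
    )
    return {
        "bootable": status == 0x80,
        "type": ptype,
        "chs_first": decode_chs(chs_first),
        "chs_last": decode_chs(chs_last),
        "lba_first": lba_first,
        "sector_count": count,
    }


# Hypothetical entry: active partition, type 0x06, starting at CHS 0/1/1.
sample = bytes([0x80, 0x01, 0x01, 0x00, 0x06, 0xFE, 0xFF, 0xFF]) + struct.pack(
    "<II", 63, 1_000_000
)
info = parse_partition_entry(sample)
print(info["chs_first"], info["lba_first"])
```

In the sample, the ending CHS field is saturated at cylinder 1,023, head 254, sector 63, because the 10-bit cylinder field cannot describe anything beyond that, while the 32-bit sector fields still can. That is one way the two descriptions of the same partition end up disagreeing.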

Exposing the disk geometry in the BIOS and MBR would not be that bad if it weren’t so limiting. The BIOS interface could support 1,024 cylinders, 256 heads, and 63 sectors per track (or 10:8:6 addressing, for the numbers of bits used for the cylinder/head/sector number; sector numbers start at 1, hence 63 rather than 64). Approximately 7.9 GB could be addressed through this interface. That would have been sufficient until 1995 or so.

However, the underlying AT-compatible hardware disk interface (ATA, for AT-Attachment) had different limits: 65,536 cylinders, 16 heads, and 255 sectors per track. This 16:4:8 addressing scheme could address up to 127.5 GB, which would have been sufficient until about 2003.

Why did the different limits matter? As long as only the BIOS was used to access the disk (e.g. DOS), only the BIOS limits applied. However, when an OS directly accessed the disk through the ATA interface (OS/2, UNIX, 386 Enhanced mode drivers for Windows 3.x), the hardware limits applied as well. The limit common to both was only 1,024 cylinders, 16 heads, and 63 sectors per track (10:4:6 addressing), or 504 MB. That limit was reached in the early 1990s. Both limits were important because both DOS (which uses the BIOS to access disks) and protected-mode operating systems with custom drivers were in use, often both on the same system. When IDE disk capacities grew beyond 504 MB, bad things started happening.
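The three capacity figures follow directly from the bit widths. The sketch below recomputes them, assuming 512-byte sectors, 1-based sector numbers, and every field used at its full width (so 256 heads for an 8-bit head number); the names are illustrative. The common limit is just the field-by-field minimum of the BIOS and ATA widths.

```python
SECTOR_SIZE = 512  # bytes; assumed throughout


def geometry_limit(cyl_bits, head_bits, sector_bits):
    """Largest geometry expressible with the given field widths."""
    # Cylinders and heads count from 0; sectors count from 1,
    # so an n-bit sector field yields 2**n - 1 usable sectors.
    return 1 << cyl_bits, 1 << head_bits, (1 << sector_bits) - 1


def capacity_bytes(geometry):
    cylinders, heads, sectors = geometry
    return cylinders * heads * sectors * SECTOR_SIZE


BIOS_BITS = (10, 8, 6)
ATA_BITS = (16, 4, 8)
# A disk address must fit both schemes, so each field gets the narrower width.
COMMON_BITS = tuple(min(b, a) for b, a in zip(BIOS_BITS, ATA_BITS))

for name, bits in (("BIOS", BIOS_BITS), ("ATA", ATA_BITS), ("common", COMMON_BITS)):
    geometry = geometry_limit(*bits)
    size = capacity_bytes(geometry)
    print(f"{name:6} {geometry}: {size / 2**20:.1f} MiB = {size / 2**30:.2f} GiB")
```

The "MB" and "GB" figures quoted in the text are binary units: 504 MB is exactly 504 x 2^20 bytes, and 127.5 GB is 127.5 x 2^30 bytes.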

In many cases, the system was simply incapable of using more than 504 MB of disk space. The reason was that the BIOS assumed that its own CHS addresses and the hardware CHS addresses were identical, so only the lowest common denominator was usable. That was quite bad, but did not destroy data. So-called dynamic drive overlays (DDOs) were used as a stopgap solution. The MBR would load a special driver which replaced the disk driver built into the BIOS; the replacement was capable of addressing large disks. BIOSes with proper large disk support soon appeared on the market as well.

Whether large disk support was implemented in DDOs or in the BIOS, some sort of a translation scheme had to be used, converting between the different BIOS and ATA limits. Translation allowed IDE disks larger than 504 MB to be used, but users soon discovered that translation was potentially dangerous. If a disk was moved to another system which used an incompatible translation scheme, data loss could occur, including a complete loss if critical disk structures were inadvertently overwritten.
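One widely used approach, often called bit-shift translation, halves the cylinder count and doubles the head count until the cylinders fit into the BIOS limit. The sketch below is an illustration of that idea only, not the exact algorithm of any particular BIOS or DDO. The drive geometry and the second system's geometry are hypothetical. It shows why a mismatch is dangerous: the same CHS address lands on a different sector.

```python
def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Conventional CHS-to-sector mapping (sector numbers are 1-based)."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def bit_shift_translate(cylinders, heads, sectors_per_track,
                        max_cylinders=1024, max_heads=255):
    """Halve cylinders and double heads until the cylinder count fits."""
    while cylinders > max_cylinders and heads * 2 <= max_heads:
        cylinders //= 2   # an odd cylinder count loses its last cylinder here
        heads *= 2
    return cylinders, heads, sectors_per_track


# Hypothetical drive reporting 4,092 cylinders, 16 heads, 63 sectors (about 2 GB).
physical = (4092, 16, 63)
system_a = bit_shift_translate(*physical)
# Hypothetical second system that presents the same disk differently.
system_b = (511, 128, 63)

address = (10, 5, 1)  # a CHS address as it might be stored in a partition table
lba_a = chs_to_lba(*address, system_a[1], system_a[2])
lba_b = chs_to_lba(*address, system_b[1], system_b[2])
print(system_a, lba_a)
print(system_b, lba_b)
```

Anything located via stored CHS values, such as a partition start or a boot sector, would be looked for in the wrong place on the second system, and a write to that location would overwrite unrelated data.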

In the second half of the 1990s, disk translation was standardized; data loss was unlikely to occur, but disk geometries still caused trouble, even when disks were almost exclusively using LBA. Geometry information was still embedded in the MBR and any extended partition tables, as well as in the DOS boot sector. Boot loaders stored in the MBR still often used the old CHS-based BIOS interface, at least when accessing the first 1,024 cylinders (Windows XP was a notorious example).

Different operating systems had slightly different ideas and limitations, sometimes with data-destroying results. Some operating systems required partitions to be aligned on cylinder boundaries, others did not. When a fake geometry was applied to a disk, sometimes there was unused storage beyond the last full cylinder—only a minor issue in the age of multi-gigabyte disks, but still an annoyance.
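Both effects are easy to quantify. The following sketch, with made-up function names and an illustrative disk size, computes how many sectors fall beyond the last full cylinder of a fake geometry and checks whether a partition start lies on a cylinder boundary.

```python
def unused_tail(total_sectors, heads, sectors_per_track):
    """Return (full_cylinders, leftover_sectors) for a given fake geometry."""
    return divmod(total_sectors, heads * sectors_per_track)


def is_cylinder_aligned(start_sector, heads, sectors_per_track):
    """True if a partition starting at start_sector begins on a cylinder boundary."""
    return start_sector % (heads * sectors_per_track) == 0


# Illustrative values: a disk of 156,301,488 sectors presented as 255 heads x 63 sectors.
TOTAL_SECTORS = 156_301_488
HEADS, SPT = 255, 63

full_cylinders, leftover = unused_tail(TOTAL_SECTORS, HEADS, SPT)
print(f"{full_cylinders} full cylinders, {leftover} sectors "
      f"({leftover * 512 / 2**20:.2f} MiB) beyond the last one")
print(is_cylinder_aligned(63, HEADS, SPT))       # False
print(is_cylinder_aligned(16_065, HEADS, SPT))   # True
```

With this geometry a cylinder is 16,065 sectors, so the leftover is always less than about 8 MB: negligible on a large disk, but still space that a strictly cylinder-oriented partitioning tool would not hand out.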

Even long after disk geometry had become irrelevant at the disk interface level, it could still cause headaches and, in the worst cases, damage data, all as a consequence of a seemingly benign design decision made decades earlier.
