How Not to Erase Data

From past blog posts it is fairly obvious that the OS/2 Museum occasionally purchases used hard disks. Most of the time, the disks are either completely erased (overwritten with zeros) or don’t have anything very interesting on them.

But sometimes they do. Old SCSI disks in particular tend to still hold old data, simply because very few people have the equipment needed to use them anymore. Recently, however, I encountered a case that appears to be an epic failure to delete data.

The hard disk is a Maxtor 25128AT, a 2.5″ IDE drive with 128 MB capacity that started its life as an OEM drive in an IBM ThinkPad. It was originally set up with MS-DOS 6.20 and Windows 3.11 for Workgroups.

A 128 MB hard disk formerly found in a ThinkPad laptop

At some point in the past, someone attempted to wipe the disk by high-level formatting it, i.e. putting in place a fresh FAT file system. That normally does quite a lot of damage by destroying the root directory and the FAT tables. If the file system was fragmented, recovering the files can be difficult and very labor intensive. Yet on this particular drive, reformatting resulted in no appreciable data loss, and recovering all the files was fairly easy. How is that possible?

The drive was originally compressed with DoubleSpace. When a drive is compressed with DoubleSpace, it becomes a “host drive” and will contain only a few files: IO.SYS, MSDOS.SYS, DBLSPACE.BIN, and DBLSPACE.000. The latter is usually a very large file that takes up almost the entire disk. Within DBLSPACE.000, there is another, FAT-like but quite non-standard file system, which contains the compressed drive contents.
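
For illustration, here is a minimal sketch, not one of the tools mentioned in this post, of what such a host drive looks like at the file system level. It parses the MBR and the FAT boot sector in a raw image of the disk and lists the root directory; on a freshly compressed drive that listing contains little more than the system files and one enormous DBLSPACE.000. The image file name is a placeholder, and the sketch assumes 512-byte sectors, MBR partitioning, and a FAT12/16 volume (all plausible for a 128 MB MS-DOS 6.x disk).

#!/usr/bin/env python3
"""List the root directory of the first FAT12/16 partition in a raw disk image.
Assumes 512-byte sectors and MBR partitioning; the image name is illustrative."""
import struct
import sys

SECTOR = 512

def read_sectors(f, lba, count=1):
    f.seek(lba * SECTOR)
    return f.read(count * SECTOR)

def list_root(image_path):
    with open(image_path, "rb") as f:
        mbr = read_sectors(f, 0)
        # Starting LBA of the first partition: little-endian dword at offset 8
        # of the first 16-byte partition entry, which begins at offset 446.
        part_lba = struct.unpack_from("<I", mbr, 446 + 8)[0]
        boot = read_sectors(f, part_lba)
        reserved, = struct.unpack_from("<H", boot, 14)    # reserved sectors
        num_fats = boot[16]                               # usually 2
        root_entries, = struct.unpack_from("<H", boot, 17)
        fat_size, = struct.unpack_from("<H", boot, 22)    # sectors per FAT (FAT12/16)
        root_lba = part_lba + reserved + num_fats * fat_size
        root_len = (root_entries * 32 + SECTOR - 1) // SECTOR
        root = read_sectors(f, root_lba, root_len)
        for off in range(0, root_entries * 32, 32):
            entry = root[off:off + 32]
            if entry[0] == 0x00:
                break                                     # end of directory
            if entry[0] == 0xE5 or entry[11] & 0x08:
                continue                                  # deleted entry or volume label
            name = entry[0:8].decode("ascii", "replace").strip()
            ext = entry[8:11].decode("ascii", "replace").strip()
            size, = struct.unpack_from("<I", entry, 28)
            print(f"{(name + '.' + ext if ext else name):<12s} {size:>10d} bytes")

if __name__ == "__main__":
    list_root(sys.argv[1])                                # e.g. maxtor-25128at.img

Run against an image of a freshly compressed disk, the output should be little more than IO.SYS, MSDOS.SYS, DBLSPACE.BIN, and a DBLSPACE.000 whose size is nearly that of the whole partition.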

When the drive showed up, I made an image of it. That’s standard procedure for getting a sense of what shape the drive is in (unsurprisingly, the health of used drives can be all over the place). Of course I immediately saw that although the drive had been freshly formatted and held almost no files, it was far from empty.
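
For reference, here is a minimal sketch of that imaging step, assuming a Linux machine where the old drive shows up as something like /dev/sdb (a hypothetical device name). It copies the drive chunk by chunk and pads any unreadable spots with zeros, so that sector offsets in the image still line up with the drive. In practice dd or ddrescue do this job better; the sketch only shows what “making an image” amounts to.

#!/usr/bin/env python3
"""Copy a drive into a raw image file, padding unreadable areas with zeros.
The device name is hypothetical; run as root and triple-check it first."""
import os
import sys

SECTOR = 512
CHUNK = 128 * SECTOR              # read 64 KB at a time

def image_drive(device, out_path):
    bad = 0
    with open(device, "rb", buffering=0) as src, open(out_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)       # a block device reports its size this way
        pos = 0
        while pos < size:
            n = min(CHUNK, size - pos)
            src.seek(pos)
            try:
                data = src.read(n)
            except OSError:
                data = b""                    # read error, treat the chunk as unreadable
                bad += 1
            if len(data) < n:
                # Pad short or failed reads so the image keeps the drive's layout.
                data += b"\x00" * (n - len(data))
            dst.write(data)
            pos += n
    print(f"imaged {size} bytes, {bad} unreadable chunk(s)")

if __name__ == "__main__":
    image_drive(sys.argv[1], sys.argv[2])     # e.g. /dev/sdb maxtor-25128at.img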

Somewhere within the first few thousand sectors, I noticed a strange-looking boot sector with an “MSDSP6.0” OEM signature, unfamiliar to me. Punching that string into a search engine resulted in exactly one hit, which almost never happens. It was a source comment in the Spanish-language version of Undocumented DOS. Fortunately I have the English-language book, as well as the source code that came with it.
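
That kind of find is easy to reproduce. The OEM name of a FAT-style boot sector sits at byte offset 3, right after the initial jump instruction, so a raw image can simply be scanned sector by sector for the string. A minimal sketch, assuming a 512-byte-per-sector image like the one produced above:

#!/usr/bin/env python3
"""Scan a raw disk image for boot sectors whose OEM name is "MSDSP6.0".
Matching only the OEM name can of course produce false positives."""
import sys

SECTOR = 512

def find_oem(image_path, oem=b"MSDSP6.0"):
    hits = []
    with open(image_path, "rb") as f:
        lba = 0
        while True:
            sector = f.read(SECTOR)
            if len(sector) < SECTOR:
                break
            # The OEM name of a FAT-style boot sector starts at offset 3,
            # immediately after the 3-byte jump instruction.
            if sector[3:3 + len(oem)] == oem:
                hits.append(lba)
            lba += 1
    return hits

if __name__ == "__main__":
    for lba in find_oem(sys.argv[1]):
        print(f"candidate boot sector at LBA {lba}")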

There I quickly established that the source code in the book dumps information about a compressed DoubleSpace volume, and that it is based on a utility called DSDUMP, published by Microsoft, so not truly undocumented after all. And I was able to find the original Microsoft utility, too.

With the DSDUMP utility in hand, I was able to determine that the reformatted disk contained the entire and untouched compressed volume file (CVF). The next step took me a few tries to get right.

I found out that the compressed disk had not been created with MS-DOS 6.22, because that version used DriveSpace rather than DoubleSpace. MS-DOS 6.0 did use DoubleSpace, but it did not like the CVF from the Maxtor drive, claiming it was too new. Finally, MS-DOS 6.2 turned out to be the right pick.

What I did next was run DoubleSpace to compress the empty disk and then copy over the DBLSPACE.000 file that I’d recovered from the original drive. This produced a fully functioning file system complete with Windows 3.11 for Workgroups and several applications, such as WordPerfect 6.0 for DOS and IBM Legato (later known as IBM Works) for Windows.

As far as I can tell, the only thing formatting the drive managed to destroy was the Windows 3.11 permanent swap file, which most likely wouldn’t have been all that interesting anyway. Everything else was still there within the CVF.

Moral of the story: If you want to erase data, do it right!


17 Responses to How Not to Erase Data

  1. Bert says:

    Reminds me of a DOS-based disk wiping product that a local recycler was using… Due to limitations in the software it couldn’t address more than the first 2 GB of the drive, so once drives larger than 2 GB were common, they were leaving a lot of data still accessible.

    Even more amusing, when I showed them a tool (I believe DBAN) that would wipe the entire drive, they complained that it was too slow, which is hardly a surprise considering it’s doing 20x as much work on a 40 GB drive.

  2. Chris M. says:

    Did Microsoft ever release any code for mounting a DoubleSpace/DriveSpace volume under an NT OS? Even if it was just user-space, it would be really handy for data recovery on older machines that I’ve run across.

    Linux had some tools, like DMSDOS, which has likely code-rotted and is broken under anything modern: https://cmp.felk.cvut.cz/~pisa/dmsdos/

  3. Yuhong Bao says:

    I believe they tried for NT 3.5, but the Stac lawsuit stopped them from doing it.

  4. Michal Necasek says:

    It does sound like they had it ready for NT 3.5 but couldn’t release DoubleSpace support because of the lawsuit. And after everything settled, somehow they never revisited the issue.

    Just one of the many signs that the NT people lived in a world of their own and were not particularly keen on cooperating with the vast DOS/Win9x majority.

  5. zeurkous says:

    Unlike OS/2 folks, you mean?

  6. Michal Necasek says:

    Well, Microsoft was only going to cooperate so far with IBM… but what excuse did Microsoft have for not working together with Microsoft?

  7. zeurkous says:

    Heh.

    The answer is, of course, that there’s no single $LARGE_ORGANIZATION.

    They all have little islands.

  8. Richard Wells says:

    Drive compression requires very complex repair procedures. That would be a lot of work to support old, slow, limited-capacity drives. Given the difference in NT use patterns (i.e. more large files) and the increased development of compressed files, the likelihood is high that some writes would fail because the actually available free space is less than the reported free space on a compressed drive. Support calls because of lost data were contrary to the design goals of NT.

  9. zeurkous says:

    The obvious solution to that is to always report the real amount of
    free space, and treat everything that becomes available due to
    later compression as a bonus.

    But yeah, then they implemented compressed files.

  10. Chris M. says:

    The fracture of the Win32 API with the release of Windows 95 (Win32c?) made NT people acutely aware of their little world. They had tons of new software that didn’t work quite right due to UI differences and missing API functions. NT 4.0 solved a lot of that, but was still missing things like DirectX.

    As for disk compression, NTFS integrated native per-file compression that worked decently enough.

  11. Michal Necasek says:

    Yes, I remember the era… and oftentimes the Win32 API differences between Win9x and NT seemed just incomprehensible. There were fundamental differences that maybe had some justification (Unicode vs. not-Unicode), and there were others that just didn’t make any sense. It was pretty clear that there were two Win32 implementations managed by teams that didn’t like each other very much at all.

  12. zeurkous says:

    There was also Win32s, which came with its main application: FreeCell.

    A victory for compatibility, for sure.

  13. Malcolm says:

    I don’t know the real reason for not supporting DoubleSpace/DriveSpace on NT, but thinking about it for a moment shows it’s very different to DOS.

    As this article mentions, DOS requires enough uncompressed data to be able to boot itself, then load the DoubleSpace TSR.

    Implementing DoubleSpace as an NT driver implies keeping the kernel and drivers in uncompressed form, loading the kernel, then allowing the kernel to decompress enough to load the rest of usermode. But that implies splitting \WINNT in half between the boot start pieces and later pieces, which are stored on different volumes. Note further that the thing telling the loader which drivers to start is the SYSTEM registry hive, which needs to be accessible to the bootloader, but presumably the intention is to compress the rest.

    A better but harder approach is to modify NTLDR to be able to load the kernel and drivers, including DoubleSpace, from a DoubleSpace compressed volume. From there the kernel can continue operating. But that’s a major chunk of code in NTLDR. This is effectively what happens in Windows 95, although that’s because it’s leveraging the original DriveSpace TSR as a bootloader, and building a new 32 bit driver as a performance optimization.

    The same basic problem appeared much later and ended up redesigning the boot system, so we now have BootMgr (system wide), which loads WinLoad (per OS image) which loads the kernel. BootMgr and WinLoad are built from the same sources and are much more modern creatures; today, they support things like Bitlocker, and WinLoad has support for WIM boot in various forms, which is not that different to DoubleSpace. With Bitlocker we even have a mini-partition that’s used to load the bootloader, which in turn can access the system from a different encrypted volume. But it took a complete redesign of the boot stack to get there.

    It may be true that the two groups hated each other, but I think it’s more true to say they didn’t understand each other. DoubleSpace made sense as a TSR, but NT didn’t have a DOS/TSR boot environment, and since NT would have had a lot more pieces that can’t or shouldn’t be compressed, using per-file compression made more sense.

  14. zeurkous says:

    Don’t forget that DS *does* offer another option: create a CVF and
    mount it, without messing with the rest of the hosting file system.

    The “boot through a CVF” option is a hack that could indeed be quite
    involved to carry over. But handling of DS volumes could’ve otherwise
    been supported (through an IFS driver, for example).

  15. CHS Hater says:

    Sort of off-topic… Are there any tricks to reading these old CHS IDE drives? I’ve been able to read most anything with an early-2000s system running either DOS or Linux. But the CHS drives (all from old laptops of DriveSpace vintage) can never be read. Is it the lack of BIOS drive types in newer systems?

  16. Michal Necasek says:

    The trick I use is an old 486 machine. That said, the vast majority of these drives can be accessed in Linux on a modern machine, as long as you have an IDE controller.

    It is possible that the BIOS in some newer machines may not support CHS addressing anymore, only LBA.

  17. CHS Hater, have you tried to use Aaru?

    As long as Linux gives it a /dev/sdX node it should work.
