Several months ago I had a go at producing a high resolution 256-color driver for Windows 3.1. The effort was successful but is not yet complete. Along the way I re-learned many things I had forgotten, and learned several new ones. This blog entry is based on notes I made during development.
Source Code and Development Environment
I took the Video 7 (V7) 256-color SuperVGA sample driver from the Windows 3.1 DDK as the starting point. The driver is written entirely in assembler (yay!), consisting of dozens of source files, over 1.5MB in total size. This driver was not an ideal starting point, but it was probably the best one available.
The first order of business was establishing a development environment. While I could have done everything in a VM, I really wanted to avoid that. Developing a display driver obviously requires many restarts of Windows and inevitably also reboots, so at least two VMs would have been needed for a sane setup.
Instead I decided to set everything up on my host system running 64-bit Windows 10. Running the original 16-bit development tools was out, but that was only a minor hurdle. The critical piece was MASM 5.NT.02, a 32-bit version of MASM 5.1 rescued from an old Windows NT SDK. The Windows 3.1 DDK source code is very heavily geared towards MASM 5.1 and converting to another assembler would have been a major effort, likely resulting in many bugs.
Fortunately MASM 5.NT.02 works just fine and assembles the source code without trouble. For the rest, I used Open Watcom 1.9 tools: wmake, wlink, and wrc (make utility, linker, and resource compiler). I used a floppy image to get the driver binary from the host system to a VM, a simpler and faster method than any sort of networking.
With everything building, the real fun started: Modifying the Video 7 driver to actually work on different “hardware”.
Trials and Tribulations
Fortunately there was not a huge amount of Video 7 specific code in the sample driver. Unfortunately the hardware specific code was sprinkled throughout the code base.
My first change was to unify the bank switching code, which is critical for performance. The sample driver had about half a dozen different bank switching routines (no, I don’t know why). I replaced them with one, and made sure the bank switching is only done when necessary (i.e. the current bank differs from the requested one).
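The unified bank switching logic can be sketched in a few lines of C (an illustration only; the real driver does this in assembler, and `set_bank_hw` is a hypothetical stand-in for the SVGA-specific register writes):

```c
#include <stdint.h>

static int hw_switches = 0;       /* counts actual hardware programmings */

/* Hypothetical hardware access; in the real driver this is an
   SVGA-specific OUT instruction sequence. */
static void set_bank_hw(uint8_t bank)
{
    (void)bank;
    hw_switches++;                /* stand-in for the port I/O */
}

static uint8_t cur_bank = 0xFF;   /* impossible value forces the first switch */

/* Program the 64K bank window only when the requested bank
   differs from the one currently mapped. */
void set_bank(uint8_t bank)
{
    if (bank != cur_bank) {
        set_bank_hw(bank);
        cur_bank = bank;
    }
}
```

Caching the current bank this way means repeated drawing within one bank costs nothing extra, which matters because the hardware switch is comparatively expensive, especially in a VM.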
Why not use a linear framebuffer, you ask? From the beginning, I did not want to restrict the driver to 386 Enhanced mode Windows. Using a LFB in Standard mode is difficult; it’s easy to reprogram a selector base, but that breaks down when the system runs with paging and the LFB is not mapped. Even worse, in real mode a LFB just can’t be used, period.
Then came the painstaking work of removing the V7 specific drawing code. There was more of it than I’d expected. The V7 hardware has latches and pattern blit registers that accelerate certain operations. Now, there were pure software fallback drawing paths more or less everywhere, but not at all clearly identified. It took some effort to force all drawing to go through the software path.
There was one very nasty bug related to this; the driver assumed that the hardware pattern blit registers take care of pattern rotation. Forcing pure software drawing could lead to a situation where the pattern was not correctly rotated. This caused very visible problems when dragging windows around, as the selection rectangle (a patterned line) was prone to leaving “droppings” behind.
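The rotation the hardware registers used to take care of can be illustrated with a small C sketch (hedged: the function name and representation are mine; the real driver manipulates the brush data in assembler, and the rotation direction may differ):

```c
#include <stdint.h>

/* An 8x8 monochrome pattern, one byte per row. Rotating it so that
   pattern row 0 lines up with destination scanline y keeps the
   pattern stationary on screen no matter where drawing starts.
   Without this step, patterned lines drawn at different y offsets
   are misaligned, leaving the "droppings" described above. */
void rotate_pattern(const uint8_t src[8], uint8_t dst[8], unsigned y)
{
    for (unsigned row = 0; row < 8; row++)
        dst[row] = src[(row + y) & 7];
}
```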
I ditched any attempts to use offscreen memory in the driver. While that is usually a performance win on real hardware, it’s not in a VM. Due to the bank switching overhead, moving data from system memory is always faster.
The drawing code in the V7 driver assumes that a single scanline never crosses a bank boundary; that significantly simplifies the drawing logic because there’s no need to potentially switch banks between two adjacent pixels. The original driver used a 1K pitch, allowing maximum horizontal resolution of 1024 pixels. I changed that to 2K, which enables resolutions up to 2048 pixels horizontally. This could potentially be made more flexible in order to conserve video memory, but at least for the time being that didn’t seem worth the effort.
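The arithmetic behind the no-crossing guarantee is simple: with a 2K pitch, exactly 32 whole scanlines fit in a 64K bank, so a scanline can never straddle a boundary. A small C sketch (the constants match the text; the helper itself is my illustration, not driver code):

```c
#include <stdint.h>

#define PITCH      2048u     /* bytes per scanline at 8bpp */
#define BANK_SIZE  65536u    /* 64K banked window */

/* Split a pixel coordinate into a bank number and an offset within
   the 64K window. Since BANK_SIZE / PITCH = 32 whole scanlines per
   bank, a single scanline never crosses a bank boundary. */
void pixel_addr(unsigned x, unsigned y, unsigned *bank, unsigned *offset)
{
    uint32_t addr = (uint32_t)y * PITCH + x;
    *bank   = (unsigned)(addr / BANK_SIZE);   /* i.e. addr >> 16 */
    *offset = (unsigned)(addr % BANK_SIZE);   /* i.e. addr & 0xFFFF */
}
```

With a 1K pitch the same property held (64 scanlines per bank), but at the cost of limiting the horizontal resolution to 1024 pixels.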
The mouse cursor drawing code had another nasty bug in it. The V7 driver would more or less always use the hardware cursor and the software fallback was probably very rarely used, if ever. Under some circumstances, the routine to save screen contents under the cursor could be entered with the direction flag set, and ended up copying data in the wrong direction and overwriting innocent memory. A single CLD in the right place fixed that.
I also had to contend with the question why the colors in my driver are different from the Windows 3.1 VGA/SVGA drivers. I learned that for whatever reason, the Windows 3.0 and 3.1 8514/A driver (the canonical high-res driver) really used a different color scheme. This was documented in the Windows 3.0 and 3.1 DDKs, although no explanation was provided as to why the colors should be different.
The linker (wlink) caused one very interesting problem. By default, wlink enables far call optimization, replacing far calls with near ones. This optimization is almost always safe and a performance win, but not in the case of the Windows display driver. The driver “compiles” drawing routines by copying fragments of code from the code segment onto the stack, assembling a selection of them together as needed and modifying constants within the code. Now, wlink optimized the far calls within the code segment, which would have been fine, but when that code got copied onto the stack, calls to the code segment really needed to be far. Disabling the far call optimization was trivial once I knew what the problem was.
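The failure mode is easy to model arithmetically. A toy C illustration (not driver code): a near call stores a displacement relative to the call site, so copying the call's bytes to a new address silently changes the target, whereas a far call carries an absolute segment:offset and survives the move.

```c
#include <stdint.h>

/* Toy model of a near call: the target is encoded as a displacement
   relative to the call site (instruction-length details elided),
   so control transfers to site + disp. */
uint32_t near_target(uint32_t site, int32_t disp)
{
    return site + (uint32_t)disp;
}
```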
As a side project, I also wrote a quick and dirty wmapsym tool, a functional equivalent of Microsoft’s MAPSYM but using Watcom map files as input. This proved extremely useful when debugging the driver.
Apropos debugging—the tool for Windows 3.1 driver debugging is WDEB386, a more or less standard Microsoft-ish debugger similar to SYMDEB, the OS/2 kernel debugger, NTSD, and others. I used it with input and output redirection to a serial port; this was routed to a pipe on the host, and PuTTY attached to the pipe.
Going More Retro
The display driver functionality in Windows 3.1 does not differ much between Standard and 386 Enhanced mode. The one area where there’s a major difference is DOS session support. In 386 Enhanced mode, there’s a whole dedicated VxD (VDDVGA) that handles video virtualization.
Even when switching to a fullscreen DOS session, the display driver remains active in 386 Enhanced mode, but it is notified via dev_to_background calls that it’s going to the background or coming back. In Standard mode, the driver is shut down via the Disable call when switching to a full-screen DOS session, and re-initialized via Enable on the way back.
Things started getting even more interesting with Windows 3.0. In Standard and 386 Enhanced mode, the differences from Windows 3.1 are minimal. But Windows 3.0 running in real mode is a different beast. I had to modify the driver to not use any APIs available only in protected mode and decide at runtime (using WinFlags) what to do.
It would have been lovely if I had the Video 7 sample driver from the Windows 3.0 DDK. Alas, I never managed to find it. Anyone?
Windows 2.x was more work to get going. The basic structure of the driver is the same; there are just fewer GDI calls the driver needs to implement. For the most part, Windows 2.x is extremely similar to Windows 3.0 in real mode. The difference is that the API calls added to support protected mode (such as AllocCSToDSAlias) do not exist in Windows 2.x at all. The drawing code is essentially identical, but the driver initialization and teardown need to be slightly different.
In theory it might have been possible to import the Windows 3.x specific routines dynamically, and use a single binary for Windows 2.x and 3.x. In practice that is not workable because the drivers also need a different format of resources (Microsoft significantly changed the resource format between Windows 2.x and 3.0). It was therefore much simpler to create a separate Windows 2.x driver binary and use conditional compilation to select either the Windows 3.x or 2.x code paths.
A related complication was that I could not find a resource compiler capable of dealing with Windows 2.x resources and running on 32-bit Windows. I resorted to running RC from a Windows 2.x SDK in a DOS VM in order to finalize the 2.x driver binary. Not pretty but fully functional.
All in all, it was an interesting retro development trip. And there’s more work to be done.
Update: An interesting problem was noticed in Windows 3.1 running in Enhanced 386 mode. A windowed DOS application performing a mode set (e.g. ‘MODE CO80’) would corrupt the display. Specifically, the host VGA hardware (host from Windows 3.1’s point of view) would switch to planar mode, disrupting 256-color banked mode operation. This was unexpected, since windowed DOS apps shouldn’t be able to do that.
This problem did not happen on Windows 3.0, and moreover it also did not happen when using VDDVGA30.386 on Windows 3.1 (the 3.0 compatible VDD or Virtual Display Driver is shipped with Windows 3.1 and some drivers use it).
Further probing established that on Windows 3.1, the display driver must call into the VDD and use the poorly documented VDDsetaddresses (0Ch) service. This subtly changes the behavior of VDDVGA. An internal fVDD_DspDrvrAware flag is set, which skips certain parts of the VDD logic.
Without the display driver linking up with the VDD, it appears that the VDD itself (as opposed to the windowed DOS box) modifies the VGA register state. This is likely not a problem for a display driver running in a planar VGA/EGA mode.
The exact logic is very poorly explained in the DDK documentation, but can be discerned from the VDDVGA source code in the Windows 3.1 DDK. As always, the source code is the best documentation.
I’ve been playing with Hack-1.03 and cross building for quite a few compilers with my main one being Microsoft C 6.0a. I use the MS-DOS Player from here:
It works like a ‘bind’ exe, so I wrap CL.EXE, and LIB.EXE. To link, I took the linker Version 5.60.220 from Visual C++ 1.52 on the Visual C++ 2.0 CD-ROM, along with nmake.
While I’m not currently using any assembly, I do want to be able to do some invalid opcodes to trigger stuff in an emulator so I’ll have to try this old MASM.
Since ms-dos player is interpretive it is slow, making the use of makefiles all the more important, but it doesn’t seem too bad to me. I had all kinds of weird cli/argument issues with the linker from Microsoft C 5/6 and found the one from VC 1.52 worked far better as a drop in replacement. I’ve found the Watcom linker less than satisfactory.
I am aware that this is a possibility. Building the driver takes several seconds on a very fast machine using native tools, so using emulation would probably be quite a lot slower. If I can run native code, I very much prefer that.
I know the Watcom linker very well so I can make it do what I need. Didn’t have any trouble with it really. It can link the Win16 display drivers just fine. I’m sure YMMV.
“Windows 3.1 running Word in a usable resolution”
Well, this is definitely true, but I see it partly as a joke as well 🙂
In fact, we have to admire how compact the Windows UI was. It had been designed for 640×480 and it showed. In both its initial incarnations (1.x/2.x/3.x and then 9x/NT4/2000) it was meant to fit on a VGA screen, even if it was a tight fit. Any better resolution just let the user fit more on screen.
From my past experience with Windows 3.x, I’d say that:
* Hercules, EGA and 640×480 were barely usable, and definitely not very useful. You could do some serious work, but it was painful, and there was definitely no way to enjoy multitasking. Just a bunch of full-screen apps and a lot of Alt+Tabbing, and Word reduced to 90% zoom.
* 800×600 was “just enough”. One could do most of the work quite effectively, and even use non-full-screen apps most of the time, unless they needed more viewing area (CorelDRAW!, Word, programming environments)
* 1024×768 was “perfect” – but with small fonts only. And it needed a damn good monitor. Not much of a surprise that the large fonts variant of this mode was more common for some time, which in turn effectively reduced the screen estate to something in between 640×480 and 800×600, but with higher quality text and graphics. (Small/Large Font modes are something that’s rarely mentioned in the retro community, by the way)
* 1280×1024 was a dream. I think it was reserved to professionals with their huge 19″ and 21″ monitors. CAD, graphics and so on. It was pretty much unobtainable without a good (and expensive) graphics card and a good (and expensive) monitor.
* Anything more? Unheard of, as far as Windows 1.x/2.x/3.x is concerned, with the exception of some more obscure graphics systems perhaps (full-page A4 displays, for instance).
Yes, I totally agree that 640×480 was not quite enough, while 800×600 was not great yet already usable. 1024×768 was quite good, and above that was luxury. Not only did that need an expensive monitor but also a good graphics card. There were always systems that could run Windows in high resolution, but they were far from common — it was probably a few thousand dollars extra.
I always hated the large fonts thing because it made the screen effectively smaller. I still don’t like it with Windows 10 🙂
Using small fonts is the sign that one is still young. Aging yields a desire for larger fonts. Windows made it comparatively easy to have the user assign the optimum looking fonts for their preferred resolution. It annoyed me how many designers decided to override the user’s choice for designer’s choice.
1024×768 was the resolution that would not die, being the highest resolution affordable monitors could produce for about a decade.
Is it feasible to use MS-DOS player to bind the 2.x RC but leave the rest of the tools as native Win32? In my collection of tools, I don’t even see a (Microsoft) 3.x RC that’s native Win32 – even when the compilers and linkers are, this piece seems purely DOS. Do others have a PE version of this (and if so, where?)
For build performance, does wmake support multi-process builds? I’ve been maintaining my own make (ymake) which aims to be nmake compatible but retrofitting multi-process execution. Obviously this requires makefiles that specify dependencies correctly and don’t have multiple tasks try to update the same shared files, etc. On modern hardware sequential compilation seems needlessly frustrating.
(PS. I survived on 640×480 until 2002. It turns out when you don’t know what you’re missing, anything is usable.)
640×480 at 16 colors?
Yes, I expect it would be feasible to tweak just the Windows 2.x resource compiler. On the other hand, I have to move the driver to a DOS machine and copy it to the corresponding Windows directory, and it’s not hard to run RC there.
No, the build is not parallel, but it takes about 3 seconds from scratch. A different make tool could be used, but I would have to build the driver from scratch a lot to spend even five minutes on parallel building and have any chance of ever getting that time back. With bigger projects it’s a very different story.
This is fantastic work.
Are you going to look at having the resolution able to change dynamically, aka guest extensions ?
Are you going to look at more colours too ?
I know DOSEMU2 could really do with a driver this worked with.
Do you have any insight on how a seamless mode could be implemented with this ?
Windows 3.x is not in the least prepared to dynamically change resolutions. You have to re-initialize GDI to change resolutions. Sure, you can play games like set a large virtual resolution and change the physical resolution, but that’s not true dynamic change. Applications are likewise not prepared to deal with resolution changes at runtime. I’m sure some hackery could be done but I don’t see the point.
The usual approach is forcing the desktop background to be transparent. The driver does not really know what that is, but it should be possible to find out somehow on the higher layers.
Win-OS/2 figured out seamless mode, so it seems… doable with video drivers. The bigger problem to solve was inter-process communication (OLE/DDE) when using protected separate Win16 application processes.
Many Win16 programs seem to have problems with high resolution screens anyway. It’s been cropping up with WineVDM. I wanted a giant window in SimTower running on my main machine, and it simply wasn’t programmed to make the main window bigger than 1280×1024 or something.
I don’t think that Win-OS/2 seamless was done in the display driver at all. Although I’m not entirely certain. All I know is that IBM had a special DDK add-on for Win-OS/2 display drivers, which was separate from the regular OS/2 DDK and I have never seen it. IBM also had source code for at least Windows 3.0 as well.
Same with NTVDM… it’s a bit different when you have all the source code and ship modified Windows 3.1 yourself.
1. Would it be possible to create a driver providing 16:9 resolutions? Like 1280×720, for example. DOSBox Staging, DOSBox X, QEMU, and Dosemu2 would greatly benefit from this.
2. How about HiColor modes (15-bit, 16-bit)?
3. Maybe the retro-focused PC emulators could provide some interface to effectively add a “hardware” acceleration for your driver?
Different resolutions would be easy. A different color depth would require a significant rework of all the rendering code, a major project. It might be easier to rewrite the driver from scratch. This is where Windows 95 made things so much easier.
As for acceleration, it’s of course possible, but of questionable utility. Windows 3.x software rendering on today’s systems is really fast, so after a lot of effort on the driver side you probably end up with barely noticeable improvements.
>I wanted a giant window in SimTower running on my main machine, and it simply wasn’t programmed to make the main window bigger than 1280×1024 or something.
SimEarth has the same problem, but even worse: it can only get to about 800×600. Civilization, on the other hand, can show the entire map at once if you can make the window large enough.
Nice. Do you have a git repo available someplace with the code?
No. Maybe some day.
Ok… And binaries?
Perhaps, once I’m satisfied with the functionality. At minimum I need to rework the driver configuration so that it allows flexible resolutions, and I would like to have some not entirely manual installation method.
For VirtualBox, several patches can be found that extend the display up to 1024×768. A good test is to change the DOS prompt PIF to open in a window rather than using full screen; most patches will break when checked that way. Seems to me that you explained some of that. Another issue I recall is memory usage together with networking, but I never took the effort to even try to analyze/optimize that. All of these are based on using Enhanced mode. Hope to see some binaries one day!
Found some more information on palette handling in Windows 3.1x and earlier. All this comes from IBM Image Adapter/A driver docs. 256 colors is a special case and the driver allows for disabling the palette manager in 3.1x to simulate Windows 1/2 behavior in 8-bit color modes.
16-bit modes and higher don’t use the palette manager, and PALETTE.EXE always looks great. The tradeoff is that the many hundreds of Windows 3.1x games and edutainment titles break, since they are hard-coded to dynamically change the 8-bit palettes.
Re running the 16-bit resource compiler, perhaps otvdm might be useful?
(Sorry if this suggestion is already known, perhaps already discussed in another blog post or the comments of another blog post).
Another question: How much/little work would it be to make this driver work as a non-accelerated driver for real hardware that uses some sort of standard API to select higher resolutions (VESA?)?
I admit that I haven’t really invested much time in it, but my conclusion is that Windows 3.x runs fine on way newer hardware than it usually ran on, but with PCI and especially AGP cards it’s hard to find a Windows 3.x driver.
The use case for this would partly be vintage computing in general, but also using special hardware with Windows 3.x software on an as-good-as-possible computer. In my particular case I have a logic analyzer that uses an ISA card and Windows 3.x software, and it would be nice to be able to run that in higher resolutions. I admit that I haven’t tried whether it would work in Win9x; that would kind of solve the problem (except that, in my experience, with perfectly bug-free software Win 3.x can actually be rock solid; I’ve experienced several weeks of uptime for Win 3.x that was actually used and not just idling, while Win 9x sooner or later deteriorates and eventually needs a reinstall).
(Sorry if I’ve already written this as a comment on another post. If so I blame that on getting old…)
Regarding Win16 programming, I wonder if there are any third-party applications that can drag files into WinFile? And how does the Windows 3.1 classic drag-and-drop interface (the non-OLE one) compare with OS/2’s drag-and-drop interface?