To redirect the output generated by MEM /D into a single contiguous text file, simply run:
MEM /D >MEMD.TXT
Duh... I'm not sure how that didn't even cross my mind!
Well, you had a lot on your mind. Very comprehensive.
Wow. Lots of stuff I haven't thought about in years, and a reminder of why I stopped programming in assembly when protected mode became common! I love that you reference using the video memory space for programs. It reminds me of a trick I used long ago with an Ada compiler that could only produce .COM files (not .EXE), which were limited to 64k. We ran out of room in our 64k memory space, and while trying anything and everything to optimize, we realized that although our monochrome video card had 2k of RAM onboard, the display was only 80x25, which was 2000 bytes, leaving 48 bytes of precious RAM available and unused. We decided to start using those bytes for variable storage. It worked great until one day our boss decided to upgrade our demo system to a new VGA card and monitor he had just bought. With the old monochrome adapter removed and the VGA installed the night before an important customer demo, our program began to fail in interesting ways we had never seen before. It took us some debugging time, but we finally realized that the 48 bytes we had long since forgotten we were using for variables were now gone. Reinstalling the old monochrome card just for its 2k of memory had us working again, though, and we were still able to use the new VGA card as the primary display. Gotta love the old days before page faults!
I'm looking forward to the DJGPP article. I hope I find it nostalgic ;-)
Feel free to contact me if you need any history clarified, assuming I can still remember any of it.
Wow, hi! Glad to have you here!
I have great memories of _using_ DJGPP back in the 90s, but never knew how it worked internally. Digging through the details for this article was hard enough already and I'm not sure how tricky it will be to get details for DJGPP. But at least the code is all out there, and now you too :) Maybe I'll have some questions indeed.
For now I'm going off http://www.delorie.com/djgpp/doc/eli-m17n99.html, which is pretty comprehensive!
The internals of 1.x were a huge mess, because of the hacks required to get back to real mode from protected mode. "Reboot with magic numbers" mostly. V2 went with a strict "use DPMI" and cleaned a lot of it up.
Eli's paper is a great source, as is the history link on the main djgpp page. We put a lot of effort into getting those right.
My cousin had a ton of customised boot options set in config.sys and autoexec.bat and nearly choked me when I ran memmaker which messed all that up!
Good old memories.
There's an error: "As an example, think about the video memory mappings in the original PC specification. The memory map reserved two chunks of the address space for video: one for monochrome displays and one for color displays. But only one of them can be in use at any given time. So whichever video mode is selected leaves the address space of the other mode unused, and thus such address space can be leveraged by the system as an UMB into which to load drivers or place user data".
Back in the day it was usual (e.g. for crackers or CAD users) to have dual monitors, one MDA and one VGA. So you have framebuffers for both adapters, and you can of course have a UMB, but not in both areas.
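To make the dual-monitor point concrete, here is a minimal sketch of which video window is left over as a UMB candidate. The address ranges are the conventional PC memory map values for the monochrome and color text windows; they are not quoted in this thread. The function name and the "whole 32 KiB window per adapter" model are simplifying assumptions for illustration only.

```python
# Conventional video text windows in the upper memory area
# (start, end inclusive). Assumption: each adapter claims its whole window.
VIDEO_WINDOWS = {
    "mono": (0xB0000, 0xB7FFF),   # monochrome text (MDA-style adapters)
    "color": (0xB8000, 0xBFFFF),  # color text (CGA, EGA/VGA text modes)
}


def free_video_windows(adapters_in_use):
    """Return the windows no installed adapter claims, i.e. UMB candidates."""
    return {
        name: window
        for name, window in VIDEO_WINDOWS.items()
        if name not in adapters_in_use
    }


# One color adapter: the monochrome window is unused.
single = free_video_windows({"color"})
# MDA plus VGA side by side: neither window is left over.
dual = free_video_windows({"mono", "color"})

print({name: (hex(lo), hex(hi)) for name, (lo, hi) in single.items()})
print(dual)
```

With a single color adapter the monochrome window is the only candidate; with both adapters installed the sketch reports none, so any UMBs would have to come from elsewhere in upper memory.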
I've just started reading the article, but THE processor for PCs was the 8088, not the 8086.
Nice article. I even went further back then by using "Stealth RAM" from Quarterdeck's QEMM https://en.wikipedia.org/wiki/QEMM. Totally Voodoo :)
A great article; the diagrams in particular are really useful to make things understandable.
Two remarks though:
- In the protected mode diagram, the base address of the segment should be 0x81DA0, and the final resolved address should be 0x89063 (assuming that your example was supposed to be the same as for real mode before).
- A round-trip through protected mode to copy data to and from XMS may seem like the most logical way to do things, but this is normally not what the actual HIMEM.SYS did. The main reason is that protected mode round-trips are slow; GDTs, LDTs and IDTs need to be set up, and since the 286 couldn't switch back from protected mode to real mode, it had to be triple faulted into a soft reset, with careful restoration of the original state after the fact. All this could take several thousand cycles. There was, however, a better way to do things, in the form of the officially undocumented but nevertheless widely used LOADALL instruction. This permitted more or less direct access to the CPU's segment descriptor cache, and could be used to assign an arbitrary base address (including above 1MiB!) to one of the segment registers, making the copy from/to high memory possible directly from real mode. This was way faster and more convenient, so every popular XMS implementation used it.
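The arithmetic behind the first remark can be checked with a short sketch. The segment value 0x81DA and the offset 0x72C3 are not stated in the comment; they are inferred from the two addresses it gives (0x81DA0 and 0x89063), so treat them as assumptions. The function names are illustrative, and the protected mode version ignores limit and privilege checks.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Real mode: shift the 16-bit segment left by 4 bits and add the offset.

    The result is masked to 20 bits, modelling the wrap-around at 1 MiB
    on the 8086/8088 address bus.
    """
    return ((segment << 4) + offset) & 0xFFFFF


def protected_mode_address(descriptor_base: int, offset: int) -> int:
    """Protected mode without paging: the base comes from the segment
    descriptor, not from the selector value itself."""
    return descriptor_base + offset


SEGMENT = 0x81DA  # assumed: inferred from the base address 0x81DA0
OFFSET = 0x72C3   # assumed: inferred as 0x89063 - 0x81DA0

print(hex(real_mode_address(SEGMENT, OFFSET)))
print(hex(protected_mode_address(0x81DA0, OFFSET)))
```

Both calls resolve to the same linear address, which is the consistency the remark asks for: a descriptor whose base is 0x81DA0 reproduces what the real-mode segment 0x81DA yields.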
Ah, awesome! Thanks for these details. I was doubtful that XMS worked with round trips to protected mode and back for the cost reasons you describe, but I was not able to find any details on how it could be made more efficient.
I'll review the diagram later and try to fix the XMS explanation accordingly.
And you've just got to mention that they used the keyboard controller to implement the back to real mode hack - just so you appreciate just how much of a hack it was :-)
The keyboard controller hack was for the A20 gate, not for returning to real mode (a.k.a. resetting the processor).
The keyboard controller did both, and other things.
Ah, of course, I forgot about that. Did anybody use the reset-via-8042 method though? From what I've read, the triple fault method seems to be the most prevalent way to reset the CPU, at least as far as PM->RM switches are concerned.
In djgpp's case, we could assume we had at least an 80386, which didn't need hacks to return to real mode. We just reset the PE bit in CR0. The keyboard controller method was *slow* compared to everything else.