I'm with you on the small model, and I did a little bit of programming with an equivalent of the large model (user data storage structures on an x86 graphing calculator): basically you normalize your address registers every time you move to the next object.

This huge model code seems a little different from what I was thinking of. It's still using basic 20-bit effective far pointers (20+12i bit?) and not 32-bit flat pointers in the code. Here's where that prevents MMU-like behavior: the real segment value lives right in the pointer, so if there's no pointer arithmetic at all, no external calls ever have to be made, even when accessing data.

What I was thinking of is this: pointers are stored as flat 32-bit (4GB) pointers. When code uses a pointer, it can either do its own arithmetic on it in 32 bits, or it can ask the kernel to translate the address into a DS:SI or ES:DI combination (behind the scenes, swapping in the 64KB this makes visible, if needed). Now the program has the pointer loaded in a "register" (DS:SI) and can perform 16-bit arithmetic on it until SI overflows or underflows, at which point it has to ask the OS to fix the pointer up again. For bigger-than-16-bit operations on a pointer there would be calls like the ones the huge-model code above uses. It would never be legal to load or store segment registers directly, but once they're loaded, anything you can logically reach by changing the index register is fair game.

That covers sequential, memcpy-like situations (first sketch at the end of this message). Another situation is random pointer access (e.g. following a linked list or binary tree). The last pointer's 32-bit representation could be compared against the current 32-bit pointer to find the offset from the last DS:SI; then either add that offset onto SI, or fall back to the OS-provided pointer manipulation if it lands outside the window (second sketch). This gets much less attractive once the data size grows past 64KB, because the window "hit rate" will drop drastically. But it's better than the current situation for >~512KB, which is not running at all.

A stack bigger than 64KB could be handled with the same logic, if it's assumed that no single stack frame is larger than 64KB (in practice probably only recursive calls would need to check), and similarly for code if no single function (or linked object file?) is larger than 64KB, but these would add performance overhead as well.

Finally, there are some "difficulty of getting a working implementation" upsides: using the ABI "for real" has a lower barrier to entry if every pointer is simply kernel-dereferenced on every use (third sketch). Slow as heck, but it should be easy to implement if the compiler is taught to do it; after that it's just a problem of optimizing (probably the most loaded statement in this whole message). Plus, it degrades gracefully into the small model (or some kind of "medium" model with separate 64KB data, code, and stack segments) for simple programs, and existing compilers would probably do a good job there, maybe with some init hacks.

tl;dr: I'd love me some nethack running on an 8088.
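
To make the above a little more concrete, here's a rough sketch of the sequential (memcpy-like) case. Nothing in it exists yet: struct farptr, sys_ptrfix(), peek() and poke() are all invented names. sys_ptrfix() stands for whatever kernel call would turn a flat 32-bit address into a usable segment:offset pair (swapping the backing 64KB window in if needed), and peek()/poke() stand for the plain MOVs through DS:SI / ES:DI that compiled code would emit directly. Types assume 8086 sizes: int = 16 bits, long = 32 bits.

struct farptr {
        unsigned int seg;       /* value to load into DS or ES           */
        unsigned int off;       /* 16-bit offset (SI/DI), free for math  */
};

extern void sys_ptrfix(unsigned long linear, struct farptr *fp);
extern unsigned char peek(unsigned int seg, unsigned int off);
extern void poke(unsigned int seg, unsigned int off, unsigned char v);

/* Copy n bytes between two flat 32-bit addresses: plain 16-bit
 * increments while we stay inside the current windows, a kernel
 * fix-up only when an offset wraps. */
void lcopy(unsigned long dst, unsigned long src, unsigned long n)
{
        struct farptr d, s;

        sys_ptrfix(dst, &d);
        sys_ptrfix(src, &s);

        while (n--) {
                poke(d.seg, d.off, peek(s.seg, s.off));
                src++; dst++;
                if (++s.off == 0)       /* SI wrapped: window exhausted */
                        sys_ptrfix(src, &s);
                if (++d.off == 0)       /* same for the DI side         */
                        sys_ptrfix(dst, &d);
        }
}

Inside one window this is all ordinary 16-bit increments; the kernel only gets involved when an offset wraps, so at most once per 64KB moved.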
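
The random-access case (linked list / tree chasing) with the window "hit" check would look something like this. Again, sys_ptrfix() and far_read() are made-up names (far_read() standing for a REP MOVSB out of the window into a local buffer), and the node layout is just an example.

struct farptr {                         /* same as in the first sketch */
        unsigned int seg;
        unsigned int off;
};

extern void sys_ptrfix(unsigned long linear, struct farptr *fp);
extern void far_read(unsigned int seg, unsigned int off,
                     void *dst, unsigned int len);

static struct farptr cur;               /* last window the kernel gave us */
static unsigned long cur_base;          /* flat address of cur.seg:0      */
static int cur_valid;

/* Point cur at 'linear', reusing the current window when possible. */
static struct farptr *deref(unsigned long linear)
{
        unsigned long delta = linear - cur_base;

        if (!cur_valid || delta > 0xFFFFUL) {   /* window miss: ask the OS */
                sys_ptrfix(linear, &cur);
                cur_base = linear - cur.off;
                cur_valid = 1;
        } else {                                /* window hit: 16-bit math */
                cur.off = (unsigned int)delta;
        }
        return &cur;
}

struct node {
        unsigned long next;     /* flat 32-bit pointer to the next node */
        int key;
};

int list_contains(unsigned long head, int key)
{
        struct node n;
        struct farptr *p;

        while (head) {
                p = deref(head);
                far_read(p->seg, p->off, &n, sizeof(n));
                if (n.key == key)
                        return 1;
                head = n.next;
        }
        return 0;
}

A structure whose working set fits in 64KB never leaves the cheap path; once it spills past the window, every miss costs a kernel call, which is exactly the hit-rate argument above.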
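
And for the "easy to implement, optimize later" point, the degenerate mode where the compiler routes every single dereference through the kernel would be roughly this (sys_peekb()/sys_pokeb() invented as before):

/* Read/write one byte at a flat 32-bit address via the kernel.
 * Dog slow, but trivially correct, and a compiler can emit it
 * mechanically for every dereference; window caching then becomes
 * a later optimization pass. */
extern unsigned char sys_peekb(unsigned long linear);
extern void sys_pokeb(unsigned long linear, unsigned char v);

/* Roughly what the compiler would emit for "*dst = *src;" where
 * dst and src are flat char pointers. */
void assign_byte(unsigned long dst, unsigned long src)
{
        sys_pokeb(dst, sys_peekb(src));
}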