I'm with you on the small model, and I did a little bit of programming with an equivalent of the large model (user data storage structures on an x86 graphing calculator): basically you normalize your address registers every time you move to the next object.

This huge model code seems a little different from what I was thinking of. It's still using basic 20-bit effective far pointers (20+12i bit?) and not 32-bit flat pointers in the code. Here's where that prevents MMU-like behavior: the real segment value lives right in the pointer, so if there's no pointer arithmetic at all, no external calls ever have to be made, even when accessing data.

What I was thinking of is this: pointers are stored as flat 32-bit (4GB) pointers. When code uses a pointer, it can either do its own arithmetic on it in 32 bits, or it can ask the kernel to translate the address into a DS:SI or ES:DI combination (behind the scenes, swapping in the 64KB this makes visible, if needed). Now the program has the pointer loaded in a "register" (DS:SI) and can perform 16-bit arithmetic on it until SI overflows or underflows, at which point it has to ask the OS to fix the pointer up again. For bigger-than-16-bit operations on a pointer there would be calls like the ones the huge-model code above uses. It would never be legal to load or store segment registers directly, but once they're loaded, anything you can logically reach by changing the index register is fair game.

That covers sequential, memcpy-like situations (first sketch at the end of this message). Another situation is random pointer access (e.g. following a linked list or binary tree). The last pointer's 32-bit representation could be compared against the current 32-bit pointer to find the offset from the last DS:SI; then either add that offset onto SI, or fall back to the OS-provided pointer manipulation if it lands outside the window (second sketch). This gets much less attractive once the data size grows past 64KB, because the window "hit rate" will drop drastically. But it's better than the current situation for >~512KB, which is not running at all.

A stack bigger than 64KB could be handled with the same logic, if it's assumed that no single stack frame is larger than 64KB (in practice probably only recursive calls would need to check), and similarly for code if no single function (or linked object file?) is larger than 64KB, but these would add performance overhead as well.

Finally, there are some "difficulty of getting a working implementation" upsides: using the ABI "for real" has a lower barrier to entry if every pointer is simply kernel-dereferenced on every use (third sketch). Slow as heck, but it should be easy to implement if the compiler is taught to do it; after that it's just a problem of optimizing (probably the most loaded statement in this whole message). Plus, it degrades gracefully into the small model (or some kind of "medium" model with separate 64KB data, code, and stack segments) for simple programs, and existing compilers would probably do a good job there, maybe with some init hacks.

tl;dr: I'd love me some nethack running on an 8088.
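
To make the above a little more concrete, here's a rough sketch of the sequential (memcpy-like) case. Nothing in it exists yet: struct farptr, sys_ptrfix(), peek() and poke() are all invented names. sys_ptrfix() stands for whatever kernel call would turn a flat 32-bit address into a usable segment:offset pair (swapping the backing 64KB window in if needed), and peek()/poke() stand for the plain MOVs through DS:SI / ES:DI that compiled code would emit directly. Types assume 8086 sizes: int = 16 bits, long = 32 bits.

struct farptr {
        unsigned int seg;       /* value to load into DS or ES           */
        unsigned int off;       /* 16-bit offset (SI/DI), free for math  */
};

extern void sys_ptrfix(unsigned long linear, struct farptr *fp);
extern unsigned char peek(unsigned int seg, unsigned int off);
extern void poke(unsigned int seg, unsigned int off, unsigned char v);

/* Copy n bytes between two flat 32-bit addresses: plain 16-bit
 * increments while we stay inside the current windows, a kernel
 * fix-up only when an offset wraps. */
void lcopy(unsigned long dst, unsigned long src, unsigned long n)
{
        struct farptr d, s;

        sys_ptrfix(dst, &d);
        sys_ptrfix(src, &s);

        while (n--) {
                poke(d.seg, d.off, peek(s.seg, s.off));
                src++; dst++;
                if (++s.off == 0)       /* SI wrapped: window exhausted */
                        sys_ptrfix(src, &s);
                if (++d.off == 0)       /* same for the DI side         */
                        sys_ptrfix(dst, &d);
        }
}

Inside one window this is all ordinary 16-bit increments; the kernel only gets involved when an offset wraps, so at most once per 64KB moved.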
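
The random-access case (linked list / tree chasing) with the window "hit" check would look something like this. Again, sys_ptrfix() and far_read() are made-up names (far_read() standing for a REP MOVSB out of the window into a local buffer), and the node layout is just an example.

struct farptr {                         /* same as in the first sketch */
        unsigned int seg;
        unsigned int off;
};

extern void sys_ptrfix(unsigned long linear, struct farptr *fp);
extern void far_read(unsigned int seg, unsigned int off,
                     void *dst, unsigned int len);

static struct farptr cur;               /* last window the kernel gave us */
static unsigned long cur_base;          /* flat address of cur.seg:0      */
static int cur_valid;

/* Point cur at 'linear', reusing the current window when possible. */
static struct farptr *deref(unsigned long linear)
{
        unsigned long delta = linear - cur_base;

        if (!cur_valid || delta > 0xFFFFUL) {   /* window miss: ask the OS */
                sys_ptrfix(linear, &cur);
                cur_base = linear - cur.off;
                cur_valid = 1;
        } else {                                /* window hit: 16-bit math */
                cur.off = (unsigned int)delta;
        }
        return &cur;
}

struct node {
        unsigned long next;     /* flat 32-bit pointer to the next node */
        int key;
};

int list_contains(unsigned long head, int key)
{
        struct node n;
        struct farptr *p;

        while (head) {
                p = deref(head);
                far_read(p->seg, p->off, &n, sizeof(n));
                if (n.key == key)
                        return 1;
                head = n.next;
        }
        return 0;
}

A structure whose working set fits in 64KB never leaves the cheap path; once it spills past the window, every miss costs a kernel call, which is exactly the hit-rate argument above.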
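
And for the "easy to implement, optimize later" point, the degenerate mode where the compiler routes every single dereference through the kernel would be roughly this (sys_peekb()/sys_pokeb() invented as before):

/* Read/write one byte at a flat 32-bit address via the kernel.
 * Dog slow, but trivially correct, and a compiler can emit it
 * mechanically for every dereference; window caching then becomes
 * a later optimization pass. */
extern unsigned char sys_peekb(unsigned long linear);
extern void sys_pokeb(unsigned long linear, unsigned char v);

/* Roughly what the compiler would emit for "*dst = *src;" where
 * dst and src are flat char pointers. */
void assign_byte(unsigned long dst, unsigned long src)
{
        sys_pokeb(dst, sys_peekb(src));
}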