Brian Dessent wrote:
> This is mostly how the existing 'small' memory model works, which is
> the default. All code and data are in the lower 2GB of address space,
> which allows the use of 32-bit relocs and 32-bit PC-relative branches
> and saves a lot of overhead in the common cases.
IIUC, that only applies to addresses that must be known at load time.
Addresses not known until run time may go beyond 4GB even in a small
memory model.
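For reference, the model being described is what GCC calls -mcmodel=small, and it is indeed the x86-64 default. In non-PIC code the compiler can materialize a global's address as a 32-bit immediate, which is exactly the load-time case:

    $ cat demo.c
    int global_object;
    int *addr(void) { return &global_object; }
    $ gcc -O2 -fno-pie -S demo.c    # -mcmodel=small is the default
    # addr() compiles to roughly:
    #       movl    $global_object, %eax   # address fits in a 32-bit immediate
    #       ret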
For this discussion, there are three types of data objects:
1) Objects whose addresses are known at load time.
2) Objects allocated on the stack.
3) Objects allocated in the heap.
If I understand the point of this thread, the main focus would be on
(3). The small memory model only addresses (1).
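A trivial program makes the distinction visible on a typical x86-64
Linux box (the exact addresses vary, but the stack in particular
normally sits well above 4GB):

    #include <stdio.h>
    #include <stdlib.h>

    int global_object;                      /* (1): address fixed at link/load time */

    int main(void)
    {
        int stack_object;                   /* (2): placed at run time */
        void *heap_object = malloc(4096);   /* (3): placed at run time */

        printf("global: %p\n", (void *)&global_object);
        printf("stack:  %p\n", (void *)&stack_object);
        printf("heap:   %p\n", heap_object);
        free(heap_object);
        return 0;
    }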
4GB is big enough that you could fit the heap and stack in there as well
for most programs. But I don't know enough about the Linux loader to
know whether it could cooperate.
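For what it's worth, Linux does expose one per-mapping knob: mmap's
MAP_32BIT flag (x86-64 specific) places a mapping in the low 2GB. That
only covers explicit mappings, though, not where the loader puts the
default heap and stack:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* MAP_32BIT asks the kernel for an address in the low 2GB, so
           the result fits in a 32-bit value. Linux/x86-64 specific. */
        void *p = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("low mapping at %p\n", p);
        return 0;
    }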
For my own purposes, cramming the whole heap and stack into 4GB would
defeat the whole purpose. If the problem is so small that everything
fits in 4GB, it is less likely to be so non-localized that 64-bit
pointers greatly hurt the L2 cache. I care about the case where an
identified subset of the data could be allocated from a 4GB pool and
there are a LOT of pointers into or within that subset of data.
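To make that concrete, here is a minimal sketch of what I mean (names
like pool_ref and pool_alloc are made up for illustration): reserve one
contiguous region up front and store 32-bit offsets from its base
instead of full pointers, so every link inside the identified subset
costs 4 bytes instead of 8:

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define POOL_SIZE (1u << 30)    /* 1GB here; anything up to 4GB works */

    typedef uint32_t pool_ref;      /* 32-bit "pointer": an offset into the pool */

    static char    *pool_base;      /* wherever the kernel puts it */
    static uint32_t pool_used;      /* offset 0 doubles as the first allocation
                                       here; a real version might reserve it as
                                       a null value */

    static int pool_init(void)
    {
        pool_base = mmap(NULL, POOL_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return pool_base == MAP_FAILED ? -1 : 0;
    }

    /* Trivial bump allocator; a real one would handle alignment,
       exhaustion, and freeing. */
    static pool_ref pool_alloc(uint32_t size)
    {
        pool_ref r = pool_used;
        pool_used += size;
        return r;
    }

    static void *pool_ptr(pool_ref r)   /* offset -> ordinary pointer */
    {
        return pool_base + r;
    }

    /* Example payoff: a list node shrinks from 16 bytes to 8. */
    struct node {
        pool_ref next;              /* 4 bytes instead of an 8-byte pointer */
        int      value;
    };

    int main(void)
    {
        if (pool_init() != 0)
            return 1;
        pool_ref head = pool_alloc(sizeof(struct node));
        struct node *n = pool_ptr(head);
        n->next  = 0;
        n->value = 42;
        printf("node at offset %u, value %d, sizeof(struct node) = %zu\n",
               head, n->value, sizeof(struct node));
        return 0;
    }

Since everything inside the pool is offset-relative, it doesn't matter
where the kernel places the reservation, which sidesteps the loader
question entirely.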
I'm not sure which case the OP cares about.