On Monday, January 2, 2017 10:08:28 PM CET Andy Lutomirski wrote:
>
> > This seems to nicely address the same problem on arm64, which has
> > run into the same issue due to the various page table formats
> > that can currently be chosen at compile time.
>
> On further reflection, I think this has very little to do with paging
> formats except insofar as paging formats make us notice the problem.
> The issue is that user code wants to be able to assume an upper limit
> on an address, and it gets an upper limit right now that depends on
> architecture due to paging formats. But someone really might want to
> write a *portable* 64-bit program that allocates memory with the high
> 16 bits clear. So let's add such a mechanism directly.
>
> As a thought experiment, what if x86_64 simply never allocated "high"
> (above 2^47-1) addresses unless a new mmap-with-explicit-limit syscall
> were used? Old glibc would continue working. Old VMs would work.
> New programs that want to use ginormous mappings would have to use the
> new syscall. This would be totally stateless and would have no issues
> with CRIU.

I can see this working well for the 47-bit addressing default, but
what about applications that actually rely on 39-bit addressing
(I'd have to double-check, but I think this was the limit that
people were most interested in for arm64)?

39 bits seems a little small to make that the default for everyone
who doesn't pass the extra flag. Having to pass another flag to
limit the addresses introduces other problems (e.g. mmap from
library call that doesn't pass that flag).

> If necessary, we could also have a prctl that changes a
> "personality-like" limit that is in effect when the old mmap was used.
> I say "personality-like" because it would reset under exactly the same
> conditions that personality resets itself.

For "personality-like", it would still have to interact with
the existing PER_LINUX32 and PER_LINUX32_3GB flags that
do the exact same thing, so actually using personality might
be better.

We still have a few bits in the personality arguments, and
we could combine them with the existing ADDR_LIMIT_3GB
and ADDR_LIMIT_32BIT flags that are mutually exclusive by
definition, such as

	ADDR_LIMIT_32BIT = 0x0800000, /* existing */
	ADDR_LIMIT_3GB   = 0x8000000, /* existing */
	ADDR_LIMIT_39BIT = 0x0010000, /* next free bit */
	ADDR_LIMIT_42BIT = 0x8010000,
	ADDR_LIMIT_47BIT = 0x0810000,
	ADDR_LIMIT_48BIT = 0x8810000,

This would probably take only one or two personality bits for the
limits that are interesting in practice.

	Arnd
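
To make the encoding above concrete, here is a rough sketch of how the three bits could be decoded as one multi-valued field. This is only an illustration under stated assumptions: the ADDR_LIMIT_39BIT/42BIT/47BIT/48BIT values are the ones proposed in this mail and exist in no kernel, addr_limit_bytes() is a made-up helper name, and the byte limits returned for the two existing flags are nominal, since their real effect is architecture-specific.

```c
/* Existing flags, same values as in <linux/personality.h>. */
#define ADDR_LIMIT_32BIT 0x0800000UL
#define ADDR_LIMIT_3GB   0x8000000UL

/* Hypothetical: values proposed in this mail, not part of any kernel. */
#define ADDR_LIMIT_39BIT 0x0010000UL
#define ADDR_LIMIT_42BIT (ADDR_LIMIT_39BIT | ADDR_LIMIT_3GB)
#define ADDR_LIMIT_47BIT (ADDR_LIMIT_39BIT | ADDR_LIMIT_32BIT)
#define ADDR_LIMIT_48BIT (ADDR_LIMIT_39BIT | ADDR_LIMIT_32BIT | ADDR_LIMIT_3GB)

/* All three bits taken together form a single field. */
#define ADDR_LIMIT_MASK  ADDR_LIMIT_48BIT

/*
 * Hypothetical helper: nominal exclusive upper bound, in bytes, for new
 * mappings under the given personality value. Returns 0 when no limit
 * is encoded. The combination 32BIT|3GB without the new bit stays
 * undefined in this sketch and is also reported as 0.
 */
static unsigned long long addr_limit_bytes(unsigned long persona)
{
	switch (persona & ADDR_LIMIT_MASK) {
	case ADDR_LIMIT_32BIT:
		return 1ULL << 32;
	case ADDR_LIMIT_3GB:
		return 3ULL << 30;
	case ADDR_LIMIT_39BIT:
		return 1ULL << 39;
	case ADDR_LIMIT_42BIT:
		return 1ULL << 42;
	case ADDR_LIMIT_47BIT:
		return 1ULL << 47;
	case ADDR_LIMIT_48BIT:
		return 1ULL << 48;
	default:
		return 0;
	}
}
```

Unrelated personality bits are masked off before the comparison, so a value such as ADDR_LIMIT_42BIT combined with other flags still decodes to the 42-bit limit. Setting the limit through personality(2) is left out on purpose, as current kernels do not interpret the new bit.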