(Sorry having issues with my IPv6 setup that duplicated the original email...)

On Fri, Sep 06, 2024 at 09:14:08AM GMT, Arnd Bergmann wrote:
> On Fri, Sep 6, 2024, at 08:14, Lorenzo Stoakes wrote:
> > On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
> >> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> >> > Create a personality flag ADDR_LIMIT_47BIT to support applications
> >> > that wish to transition from running in environments that support at
> >> > most 47-bit VAs to environments that support larger VAs. This
> >> > personality can be set to cause all allocations to be below the 47-bit
> >> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
> >> >
> >> > Signed-off-by: Charlie Jenkins <charlie@xxxxxxxxxxxx>
> >>
> >> I think having an architecture-independent mechanism to limit the size
> >> of the 64-bit address space is useful in general, and we've discussed
> >> the same thing for arm64 in the past, though we have not actually
> >> reached an agreement on the ABI previously.
> >
> > The thread on the original proposals attests to this being rather a fraught
> > topic, and I think the weight of opinion was more so in favour of opt-in
> > rather than opt-out.
>
> You mean opt-in to using the larger addresses like we do on arm64 and
> powerpc, while "opt-out" means a limit as Charlie suggested?

I guess I'm not using brilliant terminology here haha!

To clarify - the weight of opinion was for a situation where the address
space is limited, except if you set a hint above that (you could call that
opt-out or opt-in depending which way you look at it, so yeah ok very
unclear sorry!).

It was against the MAP_ flag and also I think a _flexible_ per-process
limit is also questionable as you might end up setting a limit which breaks
something else, and this starts getting messy quick.

To be clear, the ADDR_LIMIT_47BIT suggestion is absolutely a compromise and
practical suggestion.
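
To make the "limited, except if you set a hint above that" behaviour
concrete, here is a rough userspace sketch. It is not kernel code: the
SKETCH_* constants and sketch_mmap_end() are made-up names, the values are
placeholders rather than any real arch's, and I'm assuming a "hint strictly
above the window" comparison purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only; real ones are per-arch. */
#define SKETCH_DEFAULT_WINDOW	(UINT64_C(1) << 47)
#define SKETCH_TASK_SIZE	(UINT64_C(1) << 56)

/* The default window must never exceed the full address space. */
static_assert(SKETCH_DEFAULT_WINDOW <= SKETCH_TASK_SIZE,
	      "default window larger than task size");

/*
 * Pick the upper bound for an address search from the caller's hint.
 * Assumption: only a hint strictly above the default window unlocks the
 * full address space; anything else (including no hint, i.e. 0) stays
 * below the window.
 */
static uint64_t sketch_mmap_end(uint64_t hint)
{
	if (hint > SKETCH_DEFAULT_WINDOW)
		return SKETCH_TASK_SIZE;

	return SKETCH_DEFAULT_WINDOW;
}
```

So sketch_mmap_end(0) stays at the 47-bit window, and only a caller that
deliberately passes a high hint gets SKETCH_TASK_SIZE back.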
>
> >> > @@ -22,6 +22,7 @@ enum {
> >> >  	WHOLE_SECONDS =		0x2000000,
> >> >  	STICKY_TIMEOUTS	=	0x4000000,
> >> >  	ADDR_LIMIT_3GB =	0x8000000,
> >> > +	ADDR_LIMIT_47BIT =	0x10000000,
> >> >  };
> >>
> >> I'm a bit worried about having this done specifically in the
> >> personality flag bits, as they are rather limited. We obviously
> >> don't want to add many more such flags when there could be
> >> a way to just set the default limit.
> >
> > Since I'm the one who suggested it, I feel I should offer some kind of
> > vague defence here :)
> >
> > We shouldn't let perfect be the enemy of the good. This is a relatively
> > straightforward means of achieving the aim (assuming your concern about
> > arch_get_mmap_end() below isn't a blocker) which has the least impact on
> > existing code.
> >
> > Of course we can end up in absurdities where we start doing
> > ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
> > egregious maintenance burden and is entirely opt-in so has things going for
> > it.
>
> I'm more confused now, I think most importantly we should try to
> handle this consistently across all architectures. The proposed
> implementation seems to completely block addresses above BIT(47)
> even for applications that opt in by calling mmap(BIT(47), ...),
> which seems to break the existing applications.

Hm, I thought the commit message suggested the hint overrides it still?

The intent is to optionally be able to run a process that keeps higher bits
free for tagging and to be sure no memory mapping in the process will
clobber these (correct me if I'm wrong Charlie! :)

So you really wouldn't want this if you are using tagged pointers, you'd
want to be sure literally nothing touches the higher bits.

>
> If we want this flag for RISC-V and also keep the behavior of
> defaulting to >BIT(47) addresses for mmap(0, ...)
> how about changing arch_get_mmap_end() to return the limit based on
> ADDR_LIMIT_47BIT and then make this default to enabled on
> arm64 and powerpc but disabled on riscv?

But you wouldn't necessarily want all processes to be so restricted, I
think this is what Charlie's trying to avoid :)

On the other hand - I'm not sure there are many processes on any arch that'd
want the higher mappings.

So that'd push us again towards RISC-V just limiting to 48-bits and only
mapping above this if a hint is provided like x86-64 does (and as you
mentioned via IRC - it seems RISC-V is an outlier in that
DEFAULT_MAP_WINDOW == TASK_SIZE).

This would be more consistent vs. other arches.

>
> >> It's also unclear to me how we want this flag to interact with
> >> the existing logic in arch_get_mmap_end(), which attempts to
> >> limit the default mapping to a 47-bit address space already.
> >
> > How does ADDR_LIMIT_3GB presently interact with that?
>
> That is x86 specific and only relevant to compat tasks, limiting
> them to 3 instead of 4 GB. There is also ADDR_LIMIT_32BIT, which
> on arm32 is always set in practice to allow 32-bit addressing
> as opposed to ARMv2 style 26-bit addressing (IIRC ARMv3 supported
> both 26-bit and 32-bit addressing, while ARMv4 through ARMv7 are
> 32-bit only.

OK, I understand what it's for, I missed that it was an arch-specific bit,
urgh.

I'd say this limit should be min of the arch-specific limit vs. the 48-bit
limit. If you have a 36-bit address space obviously it'd be rather unwise
to try to provide 48 bit addresses..

>
>       Arnd
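
For what it's worth, the "min of the arch-specific limit vs. the flag's
limit" idea could look roughly like the below. Again this is only a
userspace sketch under assumptions: sketch_effective_limit() and
SKETCH_LIMIT_47BIT are hypothetical names, the arch's address space is
modelled as a simple bit width, and I've used 1 << 47 for the flag's limit
to match the ADDR_LIMIT_47BIT name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the limit implied by the proposed ADDR_LIMIT_47BIT flag. */
#define SKETCH_LIMIT_47BIT	(UINT64_C(1) << 47)

/*
 * Return the upper bound for mappings.
 * arch_va_bits: user address width of a (made-up) arch, 1..63.
 * flag_set:     non-zero if the process asked for the 47-bit limit.
 */
static uint64_t sketch_effective_limit(unsigned int arch_va_bits, int flag_set)
{
	uint64_t arch_limit;

	assert(arch_va_bits > 0 && arch_va_bits < 64);
	arch_limit = UINT64_C(1) << arch_va_bits;

	/* Without the flag, only the arch's own limit applies. */
	if (!flag_set)
		return arch_limit;

	/* With the flag, never hand out more than the arch can provide. */
	return arch_limit < SKETCH_LIMIT_47BIT ? arch_limit : SKETCH_LIMIT_47BIT;
}
```

With a 36-bit address space the flag changes nothing (the result stays at
1 << 36), while on a 57-bit one it clamps the limit down to 1 << 47.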