On Wed, Jan 11, 2017 at 10:49 AM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 01/11/2017 10:37 AM, Kirill A. Shutemov wrote:
>>> How about preventing the max addr from being changed to too high a
>>> value while MPX is on instead of overriding the set value? This would
>>> have the added benefit that it would prevent silent failures where you
>>> think you've enabled large addresses but MPX is also on and mmap
>>> refuses to return large addresses.
>> Setting rlimit high doesn't mean that you necessarily will get access to
>> full address space, even without MPX in picture. TASK_SIZE limits the
>> available address space too.
>
> OK, sure... If you want to take another mechanism into account with
> respect to MPX, we can do that. We'd just need to change every
> mechanism we want to support to ensure that it can't transition in ways
> that break MPX.
>
> What are you arguing here, though? Since we *might* be limited by
> something else that we should not care about controlling the rlimit?
>
>> I think it's consistent with other resources in rlimit: setting RLIMIT_RSS
>> to unlimited doesn't really mean you are not subject to other resource
>> management.
>
> The farther we get into this, the more and more I think using an rlimit
> is a horrible idea. Its semantics aren't a great match, and you seem to
> be resistant to making *this* rlimit differ from the others when there's
> an entirely need to do so. We're already being bitten by "legacy"
> rlimit. IOW, being consistent with *other* rlimit behavior buys us
> nothing, only complexity.

Taking a step back, I think it would be fantastic if we could find a
way to make this work without any inheritable settings at all.
Perhaps we could have a per-mm value that is initialized to 2^47-1 on
execve() and can be raised by an ELF note or by prctl()? Getting it
right for 32-bit would require a bit of thought. The ELF note would
make a high stack possible and, without the ELF note, we'd get a low
stack but high mmap().
Then the messy bits can be glibc's problem and a toolchain problem as
it should be, given that the only reason we need a limit at all is
because of messy userspace code.

Sure, the low stack prevents the *whole* address space from being used
in one big block for databases, but 2^57 - 2^47 ought to be good
enough.

I'm not 100% sure this is workable but, if it is, it makes everyone's
life easier. There's no need to muck around with setarch(1) or
similar hacks.
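
To make the per-mm idea a bit more concrete, here is a rough
user-space model of the semantics I have in mind. It is only a sketch:
every name in it (mm_model, model_execve, model_raise_limit,
model_enable_mpx) is hypothetical and none of it is an existing kernel
interface. It assumes that the limit resets on every execve(), that a
prctl()-style call may only raise it, and that MPX and a limit above
2^47-1 are mutually exclusive, so a conflicting request fails loudly
instead of being silently overridden.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model only; nothing here is a real kernel API. */
#define DEFAULT_ADDR_LIMIT ((UINT64_C(1) << 47) - 1) /* legacy ceiling */
#define HW_ADDR_LIMIT      ((UINT64_C(1) << 57) - 1) /* figure used in this thread */

struct mm_model {
    uint64_t addr_limit;  /* highest address mmap() may hand out */
    bool mpx_enabled;     /* assumed incompatible with a raised limit */
};

/* execve(): start from a clean state, so nothing is inherited.
 * An ELF note (modelled as a flag) may ask for the high limit up front. */
static void model_execve(struct mm_model *mm, bool elf_note_wants_high)
{
    mm->mpx_enabled = false;
    mm->addr_limit = elf_note_wants_high ? HW_ADDR_LIMIT : DEFAULT_ADDR_LIMIT;
}

/* prctl()-style request to raise the limit. Returns 0 on success, -1 on refusal. */
static int model_raise_limit(struct mm_model *mm, uint64_t new_limit)
{
    if (new_limit > HW_ADDR_LIMIT)
        return -1;                /* beyond what the model allows at all */
    if (new_limit < mm->addr_limit)
        return -1;                /* assumption: raise-only */
    if (mm->mpx_enabled && new_limit > DEFAULT_ADDR_LIMIT)
        return -1;                /* refuse rather than silently override */
    mm->addr_limit = new_limit;
    return 0;
}

/* Enabling MPX fails if the limit has already been raised. */
static int model_enable_mpx(struct mm_model *mm)
{
    if (mm->addr_limit > DEFAULT_ADDR_LIMIT)
        return -1;
    mm->mpx_enabled = true;
    return 0;
}
```

With this shape, a process that turns on MPX and then asks for large
addresses gets an error from the raise call, and a process that raised
the limit first gets an error when it tries to enable MPX. Either way
the failure is visible at the point of the conflicting request.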