On Fri, Feb 17, 2017 at 6:13 AM, Kirill A. Shutemov
<kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> This patch introduces two new prctl(2) handles to manage maximum virtual
> address available to userspace to map.

So this is my least favorite patch of the whole series, for a couple of
reasons:

 (a) adding new code, and mixing it with the mindless TASK_SIZE ->
     get_max_addr() conversion.

 (b) what's the point of that whole TASK_SIZE vs get_max_addr() thing?
     When do you use one, when the other?

so I think this patch needs a lot more thought and/or explanation.

Honestly, (a) is a no-brainer, and can be fixed by just splitting the
patch up. But I think (b) is more fundamental.

In particular, I think that get_max_addr() thing is badly defined.
When should you use TASK_SIZE, when should you use TASK_SIZE_MAX, and
when should you use get_max_addr()? I don't find that clear at all,
and I think that needs to be a whole lot more explicit and documented.

I also get the feeling that the whole thing is unnecessary. I'm
wondering if we should just instead say that the whole 47 vs 56-bit
virtual address is _purely_ about "get_unmapped_area()", and nothing
else.

IOW, I'm wondering if we can't just say that

 - if the processor and kernel support 56-bit user address space, then
   you can *always* use the whole space

 - but by default, get_unmapped_area() will only return mappings that
   fit in the 47 bit address space.

So if you use MAP_FIXED and give an address in the high range, it will
just always work, and the MM will always consider the task size to be
the full address space.

But for the common case where a process does not use MAP_FIXED, the
kernel will never give a high address by default, and you have to do
the process control thing to say "I want those high addresses".

Hmm?

In other words, I'd like to at least start out trying to keep the
differences between the 47-bit and 56-bit models as simple and minimal
as possible. Not make such a big deal out of it.
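As a rough illustration of the semantics described above, here is a
user-space toy model in C. It is not kernel code and not taken from the
patch: pick_addr(), struct toy_task, want_high, DEFAULT_LIMIT and
FULL_LIMIT are all hypothetical names, and the "search" ignores
existing mappings entirely. It only shows the split between the limit
used to validate a fixed address and the limit used when the kernel
picks the address itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, for illustration only. */
#define DEFAULT_LIMIT (1ULL << 47)  /* default window for kernel-chosen addresses */
#define FULL_LIMIT    (1ULL << 56)  /* full user address space */

/* Toy task: only tracks whether it opted in to high addresses. */
struct toy_task {
    bool want_high;  /* models the "process control thing" opt-in */
};

/*
 * Toy stand-in for the address-selection policy.
 * Returns the chosen address, or 0 on failure.
 *   fixed != 0 models MAP_FIXED: checked against the full space only.
 *   fixed == 0 models a normal mmap: the address is picked below the
 *              search limit, which depends on the opt-in.
 */
uint64_t pick_addr(const struct toy_task *t, uint64_t fixed, uint64_t len)
{
    if (len == 0 || len > FULL_LIMIT)
        return 0;

    if (fixed) {
        /* Validity never depends on the 47-bit default window. */
        return (fixed <= FULL_LIMIT - len) ? fixed : 0;
    }

    uint64_t limit = t->want_high ? FULL_LIMIT : DEFAULT_LIMIT;
    if (len > limit)
        return 0;

    /* Highest range that fits under the limit; ignores existing mappings. */
    return limit - len;
}

/* Small self-check of the three cases discussed in the text. */
void demo(void)
{
    struct toy_task t = { .want_high = false };

    /* Default: kernel-chosen addresses stay in the 47-bit window. */
    assert(pick_addr(&t, 0, 4096) < DEFAULT_LIMIT);

    /* A fixed high address works even without the opt-in. */
    assert(pick_addr(&t, 1ULL << 50, 4096) == (1ULL << 50));

    /* After opting in, kernel-chosen addresses may be high. */
    t.want_high = true;
    assert(pick_addr(&t, 0, 4096) >= DEFAULT_LIMIT);
}
```

Note that in this model nothing resembling TASK_SIZE changes per task:
only the limit used for the non-fixed search differs.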
We already have "arch_get_unmapped_area()" that controls the whole
"what will non-MAP_FIXED mmap allocations return", so I'd hope that
the above kind of semantics could be done without *any* actual
TASK_SIZE changes _anywhere_ in the VM code.

Comments?

               Linus