On 06/22/2016 06:13 PM, Andy Lutomirski wrote:
On Wed, Jun 22, 2016 at 7:53 AM, Christopher Covington
<cov@xxxxxxxxxxxxxx> wrote:
+Andy, Cyrill, Dmitry who have been discussing variable TASK_SIZE on x86
on linux-mm
http://marc.info/?l=linux-mm&m=146290118818484&w=2
I was working on an (AArch64-specific) auxiliary vector entry to export
TASK_SIZE to userspace at exec time. The goal was to allow for more
elegant, robust, and efficient replacements for the following changes:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dfaafbaaa291
https://github.com/xemul/criu/commit/c0c0546c31e6df4932669f4740197bb830a24c8d
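For context, a minimal sketch of how userspace reads an auxiliary vector entry today. It uses the existing AT_PAGESZ entry, because no TASK_SIZE entry exists in mainline; the helper name read_auxv_entry is made up for illustration, and getauxval() is assumed to be available (glibc 2.16 or later, or musl).

```c
#include <sys/auxv.h>

/*
 * Return the value of an auxiliary vector entry, or `fallback` when the
 * kernel did not supply it (getauxval() returns 0 in that case).
 * read_auxv_entry is a hypothetical helper, not an existing API.
 */
static unsigned long read_auxv_entry(unsigned long type, unsigned long fallback)
{
    unsigned long value = getauxval(type);

    return value != 0 ? value : fallback;
}
```

A TASK_SIZE entry, if one were ever added, could presumably be read the same way, with the fallback covering kernels that do not export it.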
However based on the above discussion, it appears that some sort of
prctl(PR_GET_TASK_SIZE, ...) and prctl(PR_SET_TASK_SIZE, ...) may be
preferable for AArch64. (And perhaps other justifications for the new
calls influence the x86 decisions.) What do folks think?
I would advocate a slightly different approach:
- Keep TASK_SIZE either unconditionally matching the hardware or keep
TASK_SIZE as the actual logical split between user and kernel
addresses. Don't let it change at runtime under any circumstances.
The reason is that there have been plenty of bugs and
overcomplications that result from letting it vary. For example, if
(addr < TASK_SIZE) really ought to be the correct check (assuming
USER_DS, anyway) for whether dereferencing addr will access user
memory, at least on architectures with a global address space (which
is most of them, I think).
- If needed, introduce a clean concept of the maximum address that
mmap will return, but don't call it TASK_SIZE. So, if a user program
wants to limit itself to less than the full hardware VA space (or less
than 63 bits, for that matter), it can.
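To illustrate the first point, a rough sketch of the check when the split is a fixed constant. EXAMPLE_TASK_SIZE (1 << 47) and both helper names are assumptions chosen for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed user/kernel split; the value is illustrative only. */
#define EXAMPLE_TASK_SIZE (UINT64_C(1) << 47)

/* A single address is user memory iff it lies below the split. */
static bool addr_is_user(uint64_t addr)
{
    return addr < EXAMPLE_TASK_SIZE;
}

/* A range [addr, addr + len) is user memory iff it ends at or below the
 * split. Written so that addr + len is never computed and cannot wrap. */
static bool range_is_user(uint64_t addr, uint64_t len)
{
    return len <= EXAMPLE_TASK_SIZE && addr <= EXAMPLE_TASK_SIZE - len;
}
```

Because the constant never changes at runtime, the answer does not depend on the personality of the calling task.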
As an example, a 32-bit x86 program really could have something mapped
above the 32-bit boundary. It just wouldn't be useful, but the kernel
should still understand that it's *user* memory.
So you'd have PR_SET_MMAP_LIMIT and PR_GET_MMAP_LIMIT or similar instead.
I agree -- this approach seems clear.
It also complements your idea of unifying TASK_SIZE for x86 and leaving
only the ADDR_LIMIT_32BIT setting with personality().
Also, before getting *too* excited about this kind of VA limit, keep
in mind that SPARC has invented this thingy called "Application Data
Integrity".
Thanks for the link -- what a good thing. I wish it could work on
something other than a per-page basis, heh.