* Dave Hansen <dave@xxxxxxxx> wrote:

> On 10/01/2015 01:39 PM, Kees Cook wrote:
> > On Thu, Oct 1, 2015 at 4:17 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >> So could we try to add an (opt-in) kernel option that enables this transparently
> >> and automatically for all PROT_EXEC && !PROT_WRITE mappings, without any
> >> user-space changes and syscalls necessary?
> >
> > I would like this very much. :)
>
> Here it is in a quite fugly form (well, it's not opt-in). Init crashes
> if I boot with this, though.
>
> I'll see if I can turn it in to a bit more of an opt-in and see what's
> actually going wrong.

So the reality of modern Linux distros is that, according to some limited
strace-ing around, pure PROT_EXEC usage does not seem to exist: 99% of
executable mappings are mapped via PROT_EXEC|PROT_READ.

So the most usable kernel testing approach would be to enable these types of
pkeys for a child task via some mechanism and have all of its children inherit
it (including across non-suid exec) - but not any other task.

You could hijack a new personality bit just for debug purposes - see the
(totally untested) patch below.

Depending on user-space's assumptions it might not end up being anything usable
we can apply, but it would be a great testing tool if it worked to a certain
degree.

I.e. allow the system to boot up without pkeys set for any task, then set the
personality of a shell process to PER_LINUX_PKEYS and see which binaries (if
any!) will start up without segfaulting.

This way you don't have to debug SystemD, which is extremely fragile and
passive-aggressive towards kernels that don't behave in precisely the fashion
under which SystemD is being developed.

Thanks,

	Ingo

========>

Absolutely-Not-Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>

 include/uapi/linux/personality.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/personality.h b/include/uapi/linux/personality.h
index aa169c4339d2..bead47213419 100644
--- a/include/uapi/linux/personality.h
+++ b/include/uapi/linux/personality.h
@@ -8,6 +8,7 @@
  * These occupy the top three bytes.
  */
 enum {
+	PROT_READ_EXEC_HACK =	0x0010000,	/* PROT_READ|PROT_EXEC == PROT_EXEC hack */
 	UNAME26 =		0x0020000,
 	ADDR_NO_RANDOMIZE =	0x0040000,	/* disable randomization of VA space */
 	FDPIC_FUNCPTRS =	0x0080000,	/* userspace function ptrs point to descriptors
@@ -41,6 +42,7 @@ enum {
 enum {
 	PER_LINUX =		0x0000,
 	PER_LINUX_32BIT =	0x0000 | ADDR_LIMIT_32BIT,
+	PER_LINUX_PKEYS =	0x0000 | PROT_READ_EXEC_HACK,
 	PER_LINUX_FDPIC =	0x0000 | FDPIC_FUNCPTRS,
 	PER_SVR4 =		0x0001 | STICKY_TIMEOUTS | MMAP_PAGE_ZERO,
 	PER_SVR3 =		0x0002 | STICKY_TIMEOUTS | SHORT_INODE,
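
For the testing flow described above (flip the personality on one task, then
see which binaries still start), a user-space launcher could look roughly like
the sketch below. It is equally untested against a real pkeys kernel.
Assumptions: PROT_READ_EXEC_HACK and PER_LINUX_PKEYS are taken from the patch
and defined locally, since they are not in any released header; the helper
names add_pkeys_flag() and exec_with_pkeys() and the PKEYS_RUN_MAIN build
switch are made up for illustration. The patch only adds the constants, so the
bit has no effect until the kernel's mmap path actually checks it.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/personality.h>

/* Assumption: values copied from the untested patch, not from mainline. */
#define PROT_READ_EXEC_HACK	0x0010000UL
#define PER_LINUX_PKEYS		(0x0000UL | PROT_READ_EXEC_HACK)

/* Add the hack bit to a persona value, leaving all other flags untouched. */
static unsigned long add_pkeys_flag(unsigned long persona)
{
	return persona | PROT_READ_EXEC_HACK;
}

/*
 * Set the hack bit on the calling task, then exec argv[0].
 * Only returns (with -1) if the personality change or the exec fails.
 */
static int exec_with_pkeys(char *const argv[])
{
	int cur;

	assert(argv != NULL && argv[0] != NULL);

	/* personality(0xffffffff) queries the current value without changing it. */
	cur = personality(0xffffffff);
	if (cur == -1)
		return -1;

	if (personality(add_pkeys_flag((unsigned long)cur)) == -1)
		return -1;

	/* The personality stays with the task across exec and is passed on by fork. */
	execvp(argv[0], argv);
	return -1;
}

/* Build with -DPKEYS_RUN_MAIN to get the standalone launcher. */
#ifdef PKEYS_RUN_MAIN
int main(int argc, char *argv[])
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <command> [args...]\n", argv[0]);
		return 2;
	}

	exec_with_pkeys(&argv[1]);
	perror("exec_with_pkeys");
	return 127;
}
#endif
```

Built as, say, pkeys-run, running "pkeys-run /bin/bash" would give a shell
whose children all carry the bit, while init and everything else on the system
stay untouched.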