On Tue, Jul 12, 2016 at 10:12 AM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote: > On 07/12/2016 09:32 AM, Andy Lutomirski wrote: >> I think it's more or less impossible to get sensible behavior passing >> pkey != 0 data to legacy functions. If you call: >> >> void frob(struct foo *p); >> >> If frob in turn passes p to a thread, what PKRU is it supposed to use? > > The thread inheritance of PKRU can be nice. It actually gives things a > good chance of working if you can control PKRU before clone(). I'd > describe the semantics like this: > > PKRU values are inherited at the time of a clone() system > call. Threads unaware of protection keys may work on > protection-key-protected data as long as PKRU is set up in > advance of the clone() and never needs to be changed inside the > thread. > > If a thread is created before PKRU is set appropriately, the > thread may not be able to act on protection-key-protected data. Given the apparent need for seccomp's TSYNC, I'm a bit nervous that this will be restrictive to a problematic degree. > > Otherwise, the semantics are simpler, but they basically give threads no > chance of ever working: > > Threads unaware of protection keys and which can not manage > PKRU may not operate on data where a non-zero key has been > passed to pkey_mprotect(). > > It isn't clear to me that one of these is substantially better than the > other. It's fairly easy in either case for an app that cares to get the > behavior of the other. > > But, one is clearly easier to implement in the kernel. :) > >>>> So how is user code supposed lock down all of its threads? >>>> >>>> seccomp has TSYNC for this, but I don't think that PKRU allows >>>> something like that. >>> >>> I'm not sure this is possible for PKRU. Think of a simple PKRU >>> manipulation in userspace: >>> >>> pkru = rdpkru(); >>> pkru |= PKEY_DENY_ACCESS<<key*2; >>> wrpkru(pkru); >>> >>> If we push a PKRU value into a thread between the rdpkru() and wrpkru(), >>> we'll lose the content of that "push". I'm not sure there's any way to >>> guarantee this with a user-controlled register. >> >> We could try to insist that user code uses some vsyscall helper that >> tracks which bits are as-yet-unassigned. That's quite messy, though. > > Yeah, doable, but not without some new data going out to userspace, plus > the vsyscall code itself. > >> We could also arbitrarily partition the key space into >> initially-wide-open, initially-read-only, and initially-no-access and >> let pkey_alloc say which kind it wants. > > The point is still that wrpkru destroyed the 'push' operation. You > always end up with a PKRU that (at least temporarily) ignored the 'push'. > Not with my partitioning proposal. We'd never asynchronously modify another thread's state -- we'd start start with a mask that gives us a good chance of having the initial state always be useful. To be completely precise, the initial state would be something like: 0 = all access, 1 (PROT_EXEC) = deny read and write, 2-11: deny read and write, 12-21: deny write, 22-31: all access Then pkru_alloc would take a parameter giving the requested initial state, and it would only work if a key with that initial state is available. If we went with the vdso approach, the API could look like: pkru_state_t prev = pkru_push(mask, value); ... pkru_pop(prev); // or pkru_pop(mask, prev)? This doesn't fundamentally require the vdso, except that implementing bitwise operations on PKRU can't be done atomically with RDPKRU / WRPKRU. Grr. This also falls apart pretty badly when sigreturn happens, so I don't think I like this approach. --Andy -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>