On 11/23/2017 04:38 AM, Florian Weimer wrote:
> On 11/22/2017 05:32 PM, Dave Hansen wrote:
>> On 11/22/2017 08:21 AM, Florian Weimer wrote:
>>> On 11/22/2017 05:10 PM, Dave Hansen wrote:
>>>> On 11/22/2017 04:15 AM, Florian Weimer wrote:
>>>>> On 11/22/2017 09:18 AM, Vlastimil Babka wrote:
>>>>>> And, was the pkey == -1 internal wiring supposed to be exposed to the
>>>>>> pkey_mprotect() syscall, or should there have been a pre-check
>>>>>> returning EINVAL in SYSCALL_DEFINE4(pkey_mprotect), before calling
>>>>>> do_mprotect_pkey())? I assume it's too late to change it now anyway
>>>>>> (or not?), so should we also document it?
>>>>>
>>>>> I think the -1 case to set the default key is useful because it
>>>>> allows you to use a key value of -1 to mean “MPK is not supported”,
>>>>> and still call pkey_mprotect.
>>>>
>>>> The behavior to not allow 0 to be set was unintentional and is a bug.
>>>> We should fix that.
>>>
>>> On the other hand, x86-64 has no single default protection key due to
>>> the PROT_EXEC emulation.
>>
>> No, the default is clearly 0 and documented to be so. The PROT_EXEC
>> emulation one should be inaccessible in all the APIs so does not even
>> show up as *being* a key in the API.

I should have been more explicit: the EXEC pkey does not show up in the
syscall API.

> I see key 1 in /proc for a PROT_EXEC mapping. If I supply an explicit
> protection key, that key is used, and the page ends up having read
> access enabled.
>
> The key is also visible in the siginfo_t argument on read access to a
> PROT_EXEC mapping with the default key, so it's not just /proc:
>
> page 1 (0x7f008242d000): read access denied
> SIGSEGV address: 0x7f008242d000
> SIGSEGV code: 4
> SIGSEGV key: 1
>
> I'm attaching my test.

Yes, it is exposed there. But, as a non-allocated pkey, the intention
in the kernel was to make sure that it could not be passed to the
syscalls. If that behavior is broken, we should probably fix it.

>> The fact that it's implemented
>> with pkeys should be pretty immaterial other than the fact that you
>> can't touch the high bits in PKRU.
>
> I don't see a restriction for PKRU updates. If I write zero to the PKRU
> register, PROT_EXEC implies PROT_READ, as I would expect.

I'll rephrase: The fact that it's implemented with pkeys should be
pretty immaterial other than the fact that you must not touch the bits
controlling PROT_EXEC in PKRU if you want to keep it working.

There is no restriction which is *enforced*. It's just documented.
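
To make the "must not touch" rule concrete, here is a minimal sketch in C
of a read-modify-write update on a PKRU value. It assumes the x86 layout
of two bits per key (access-disable in the low bit, write-disable in the
high bit). The helper names and the SKETCH_* macros are made up for
illustration and are not kernel or glibc API. It only operates on a plain
integer; actually reading or writing the register is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; these mirror the usual per-key rights bits. */
#define SKETCH_PKEY_DISABLE_ACCESS 0x1u
#define SKETCH_PKEY_DISABLE_WRITE  0x2u
#define SKETCH_NR_PKEYS            16

/* Mask covering the two PKRU bits that belong to one key. */
static inline uint32_t pkru_mask_for_key(int pkey)
{
    assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
    return 0x3u << (2 * pkey);
}

/* Extract the two rights bits for one key from a PKRU value. */
static inline uint32_t pkru_rights_for_key(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
    return (pkru >> (2 * pkey)) & 0x3u;
}

/*
 * Return a new PKRU value with the rights for `pkey` replaced and every
 * other key's bits left exactly as they were. This is the part that
 * keeps an execute-only key working: its bits are never rewritten
 * unless the caller names that key explicitly.
 */
static inline uint32_t pkru_update_key(uint32_t pkru, int pkey,
                                       uint32_t rights)
{
    uint32_t mask = pkru_mask_for_key(pkey);

    assert((rights & ~0x3u) == 0);
    return (pkru & ~mask) | (rights << (2 * pkey));
}
```

For example, starting from a value with only key 1's access-disable bit
set (0x4; key 1 is what showed up in the SIGSEGV output above, but which
key the kernel picks is not something to rely on), setting write-disable
on key 2 yields 0x24 and leaves key 1's bits alone. Writing zero to the
whole register would clear them instead, which is the case where
PROT_EXEC ends up implying PROT_READ.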