On 2019-03-08 11:01:25 [-0800], Dave Hansen wrote:
> On 3/8/19 10:08 AM, Sebastian Andrzej Siewior wrote:
> > On 2019-02-25 10:16:24 [-0800], Dave Hansen wrote:
> >>> + if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
> >>> + return;
> >>> +
> >>> + if (current->mm) {
> >>> + pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
> >>> + WARN_ON_ONCE(!pk);
…
> Nothing will break, but the warning will trigger, which isn't nice.

The warning should trigger if something goes south; I was not expecting
it to happen.

> > My understanding is that the in-kernel XSAVE will always save everything
> > so we should never "lose" the XFEATURE_PKRU no matter what user space
> > does.
> >
> > So as test case you want
> >    xsave (-1 & ~XFEATURE_PKRU)
> >    xrestore (-1 & ~XFEATURE_PKRU)
> >
> > in userland and then a context switch to see if the warning above
> > triggers?
>
> I think you need an XRSTOR with RFBM=-1 (or at least with the PKRU bit
> set) and the PKRU bit in the XFEATURES field in the XSAVE buffer set to 0.

Let me check that: I will write a test case in userland and come back
with the results. I can remove that warning, but I wasn't expecting it
to trigger, so let me verify that first.

> >>> + if (pk)
> >>> + pkru_val = pk->pkru;
> >>> + }
> >>> + __write_pkru(pkru_val);
> >>> }
> >>
> >> A comment above __write_pkru() would be nice to say that it only
> >> actually does the slow instruction on changes to the value.
> >
> > Could we please not do this? It is a comment above one of the callers
> > function and we have two or three. And we have that comment already
> > within __write_pkru().
>
> I looked at this code and thought "writing PKRU is slow", and "this
> writes PKRU unconditionally", and "the __ version of the function
> shouldn't have much logic in it".
>
> I got 2/3 wrong. To me that means this site needs a 1-line comment.
> Feel free to move one of the other comments to here if you think it's
> over-commented, but this site needs one.

Right, because things changed as part of the patch series. You wanted
__write_pkru() to have the same semantics as __read_pkru(), which is
currently the case because __write_pkru() has the check. It would be
great if we could rename it to something else and avoid the comment.
(Because if this user gets a comment then the others should, too, and I
think this is overkill.)

> > Last time we talked about this we agreed (or this was my impression) that
> > 0 should be written so that the kernel thread should always be able to
> > write to user space in case it borrowed its mm (otherwise it has none
> > and it would fail anyway).
>
> We can't write to userspace when borrowing an mm. If the kernel borrows
> an mm, we might as well be on the init_mm which has no userspace mappings.

If a kernel thread borrows an mm from a user task via use_mm(), then it
_can_ write to that task's userland memory from a kthread.

> > We didn't want to leave PKRU alone because the outcome (whether or not
> > the write by the kernel thread succeeds) should not depend on the last
> > running task (and be random) but deterministic.
>
> Right, so let's make it deterministically restrictive: either
> init_pkru_value, or -1 since kernel threads shouldn't be touching
> userspace in the first place.

I'm fine either way, just tell me what you want. Just consider the
use_mm() part I wrote above.
(I remember you/luto suggested to have an API for something like that so
that the PKRU value can be

Sebastian
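
For reference, the XRSTOR case discussed above can be modelled in plain C
without touching real hardware. This is only a sketch under stated
assumptions: every toy_* name is hypothetical, the "XSAVE area" is reduced
to the feature bitmap plus PKRU, and the save side assumes an
XSAVEOPT/XSAVES-style init optimization that clears the feature bit when
PKRU holds its init value of 0. It is not the kernel code and not the
userland test itself.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_XFEATURE_PKRU      9   /* PKRU is XSAVE state component 9 */
#define TOY_XFEATURE_MASK_PKRU (1ULL << TOY_XFEATURE_PKRU)
#define TOY_PKRU_INIT          0u  /* init value of the PKRU register */

/* Hypothetical, heavily reduced XSAVE area: header bitmap plus PKRU. */
struct toy_xsave {
	uint64_t xfeatures;  /* stands in for the XSTATE_BV header field */
	uint32_t pkru;
};

/*
 * Model of XRSTOR restricted to PKRU. Returns the new register value:
 * - bit not in RFBM: register left alone
 * - bit in RFBM and in the buffer bitmap: loaded from the buffer
 * - bit in RFBM but clear in the buffer bitmap: reset to the init value
 */
uint32_t toy_xrstor_pkru(const struct toy_xsave *buf, uint64_t rfbm,
			 uint32_t cur)
{
	if (!(rfbm & TOY_XFEATURE_MASK_PKRU))
		return cur;
	if (buf->xfeatures & TOY_XFEATURE_MASK_PKRU)
		return buf->pkru;
	return TOY_PKRU_INIT;
}

/*
 * Model of the save side, ASSUMING the init optimization applies:
 * a component in its init state is reported as absent.
 */
void toy_xsave_pkru(struct toy_xsave *buf, uint32_t cur)
{
	if (cur == TOY_PKRU_INIT) {
		buf->xfeatures &= ~TOY_XFEATURE_MASK_PKRU;
	} else {
		buf->xfeatures |= TOY_XFEATURE_MASK_PKRU;
		buf->pkru = cur;
	}
}

/* Mirrors the get_xsave_addr() idea: NULL when the feature bit is clear. */
const uint32_t *toy_get_pkru_addr(const struct toy_xsave *buf)
{
	return (buf->xfeatures & TOY_XFEATURE_MASK_PKRU) ? &buf->pkru : NULL;
}
```

In this model, an XRSTOR with RFBM=-1 from a buffer whose PKRU bit is clear
puts PKRU back into its init state, and a later save then reports the
component as absent, so the lookup returns NULL. That is the situation in
which the WARN_ON_ONCE(!pk) above would fire.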