On Tue, Nov 9, 2021 at 1:58 PM Brian Geffon <bgeffon@xxxxxxxxxx> wrote: > > Hi Andy, > > On Tue, Nov 9, 2021 at 9:58 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote: > > Here's an excerpt from an old email that I, perhaps unwisely, sent to Dave but not to a public list: > > > > static inline void write_pkru(u32 pkru) > > { > > struct pkru_state *pk; > > > > if (!boot_cpu_has(X86_FEATURE_OSPKE)) > > return; > > > > pk = get_xsave_addr(¤t->thread.fpu.state.xsave, > > XFEATURE_PKRU); > > > > /* > > * The PKRU value in xstate needs to be in sync with the value > > that is > > * written to the CPU. The FPU restore on return to userland would > > * otherwise load the previous value again. > > */ > > fpregs_lock(); > > if (pk) > > pk->pkru = pkru; > > > > ^^^ > > else we just write to the PKRU register but leave XINUSE[PKRU] clear on > > return to usermode? That seems... unwise. > > > > __write_pkru(pkru); > > fpregs_unlock(); > > } > > > > I bet you're hitting exactly this bug. The fix ended up being a whole series of patches, but the gist of it is that the write_pkru() slow path needs to set the xfeature bit in the xsave buffer and then do the write. It should be possible to make a little patch to do just this in a couple lines of code. > > I think you've got the right idea, the following patch does seem to > fix the problem on this CPU, this is based on 5.13. It seems the > changes to asm/pgtable.h were not enough, I also had to modify > fpu/internal.h to get it working properly. > Actually, it seems that only the changes to fpu/internal.h seem necessary. I guess the switch_fpu_finish explains how it's reverting to the initial value. diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 16bf4d4a8159..ed2ce7d1afeb 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -564,18 +564,16 @@ static inline void switch_fpu_finish(struct fpu *new_fpu) * PKRU state is switched eagerly because it needs to be valid before we * return to userland e.g. for a copy_to_user() operation. */ - if (!(current->flags & PF_KTHREAD)) { - /* - * If the PKRU bit in xsave.header.xfeatures is not set, - * then the PKRU component was in init state, which means - * XRSTOR will set PKRU to 0. If the bit is not set then - * get_xsave_addr() will return NULL because the PKRU value - * in memory is not valid. This means pkru_val has to be - * set to 0 and not to init_pkru_value. - */ - pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); - pkru_val = pk ? pk->pkru : 0; - } + /* + * If the PKRU bit in xsave.header.xfeatures is not set, + * then the PKRU component was in init state, which means + * XRSTOR will set PKRU to 0. If the bit is not set then + * get_xsave_addr() will return NULL because the PKRU value + * in memory is not valid. This means pkru_val has to be + * set to 0 and not to init_pkru_value. + */ + pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); + pkru_val = pk ? pk->pkru : 0; __write_pkru(pkru_val); }