Hi Andy, On Tue, Nov 9, 2021 at 9:58 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote: > Here's an excerpt from an old email that I, perhaps unwisely, sent to Dave but not to a public list: > > static inline void write_pkru(u32 pkru) > { > struct pkru_state *pk; > > if (!boot_cpu_has(X86_FEATURE_OSPKE)) > return; > > pk = get_xsave_addr(¤t->thread.fpu.state.xsave, > XFEATURE_PKRU); > > /* > * The PKRU value in xstate needs to be in sync with the value > that is > * written to the CPU. The FPU restore on return to userland would > * otherwise load the previous value again. > */ > fpregs_lock(); > if (pk) > pk->pkru = pkru; > > ^^^ > else we just write to the PKRU register but leave XINUSE[PKRU] clear on > return to usermode? That seems... unwise. > > __write_pkru(pkru); > fpregs_unlock(); > } > > I bet you're hitting exactly this bug. The fix ended up being a whole series of patches, but the gist of it is that the write_pkru() slow path needs to set the xfeature bit in the xsave buffer and then do the write. It should be possible to make a little patch to do just this in a couple lines of code. I think you've got the right idea, the following patch does seem to fix the problem on this CPU, this is based on 5.13. It seems the changes to asm/pgtable.h were not enough, I also had to modify fpu/internal.h to get it working properly. >From e5e184d68ac6ca93c3cd2cc88d61af3260d1c014 Mon Sep 17 00:00:00 2001 From: Brian Geffon <bgeffon@xxxxxxxxxx> Date: Tue, 9 Nov 2021 17:08:25 +0000 Subject: [PATCH] x86/fpu: Set XFEATURE_PKRU when writing to pkru On kernels prior to 5.14 the write_pkru path could end up writing to the pkru register without updating the corresponding state. Signed-off-by: Brian Geffon <bgeffon@xxxxxxxxxx> --- arch/x86/include/asm/fpu/internal.h | 22 ++++++++++------------ arch/x86/include/asm/pgtable.h | 5 +++-- 2 files changed, 13 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index 16bf4d4a8159..ed2ce7d1afeb 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -564,18 +564,16 @@ static inline void switch_fpu_finish(struct fpu *new_fpu) * PKRU state is switched eagerly because it needs to be valid before we * return to userland e.g. for a copy_to_user() operation. */ - if (!(current->flags & PF_KTHREAD)) { - /* - * If the PKRU bit in xsave.header.xfeatures is not set, - * then the PKRU component was in init state, which means - * XRSTOR will set PKRU to 0. If the bit is not set then - * get_xsave_addr() will return NULL because the PKRU value - * in memory is not valid. This means pkru_val has to be - * set to 0 and not to init_pkru_value. - */ - pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); - pkru_val = pk ? pk->pkru : 0; - } + /* + * If the PKRU bit in xsave.header.xfeatures is not set, + * then the PKRU component was in init state, which means + * XRSTOR will set PKRU to 0. If the bit is not set then + * get_xsave_addr() will return NULL because the PKRU value + * in memory is not valid. This means pkru_val has to be + * set to 0 and not to init_pkru_value. + */ + pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU); + pkru_val = pk ? pk->pkru : 0; __write_pkru(pkru_val); } diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index b1099f2d9800..d00fc2df4cfe 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -137,18 +137,19 @@ static inline u32 read_pkru(void) static inline void write_pkru(u32 pkru) { struct pkru_state *pk; + struct fpu *fpu = ¤t->thread.fpu; if (!boot_cpu_has(X86_FEATURE_OSPKE)) return; - pk = get_xsave_addr(¤t->thread.fpu.state.xsave, XFEATURE_PKRU); - /* * The PKRU value in xstate needs to be in sync with the value that is * written to the CPU. The FPU restore on return to userland would * otherwise load the previous value again. */ + fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_PKRU; fpregs_lock(); + pk = get_xsave_addr(&fpu->state.xsave, XFEATURE_PKRU); if (pk) pk->pkru = pkru; __write_pkru(pkru); -- 2.34.0.rc0.344.g81b53c2807-goog