The following commit has been merged into the x86/urgent branch of tip: Commit-ID: ae6012d72fa60c9ff92de5bac7a8021a47458e5b Gitweb: https://git.kernel.org/tip/ae6012d72fa60c9ff92de5bac7a8021a47458e5b Author: Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx> AuthorDate: Tue, 19 Nov 2024 17:45:20 Committer: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> CommitterDate: Mon, 02 Dec 2024 15:25:29 -08:00 x86/pkeys: Ensure updated PKRU value is XRSTOR'd When XSTATE_BV[i] is 0, and XRSTOR attempts to restore state component 'i' it ignores any value in the XSAVE buffer and instead restores the state component's init value. This means that if XSAVE writes XSTATE_BV[PKRU]=0 then XRSTOR will ignore the value that update_pkru_in_sigframe() writes to the XSAVE buffer. XSTATE_BV[PKRU] only gets written as 0 if PKRU is in its init state. On Intel CPUs, basically never happens because the kernel usually overwrites the init value (aside: this is why we didn't notice this bug until now). But on AMD, the init tracker is more aggressive and will track PKRU as being in its init state upon any wrpkru(0x0). Unfortunately, sig_prepare_pkru() does just that: wrpkru(0x0). This writes XSTATE_BV[PKRU]=0 which makes XRSTOR ignore the PKRU value in the sigframe. To fix this, always overwrite the sigframe XSTATE_BV with a value that has XSTATE_BV[PKRU]==1. This ensures that XRSTOR will not ignore what update_pkru_in_sigframe() wrote. The problematic sequence of events is something like this: Userspace does: * wrpkru(0xffff0000) (or whatever) * Hardware sets: XINUSE[PKRU]=1 Signal happens, kernel is entered: * sig_prepare_pkru() => wrpkru(0x00000000) * Hardware sets: XINUSE[PKRU]=0 (aggressive AMD init tracker) * XSAVE writes most of XSAVE buffer, including XSTATE_BV[PKRU]=XINUSE[PKRU]=0 * update_pkru_in_sigframe() overwrites PKRU in XSAVE buffer ... signal handling * XRSTOR sees XSTATE_BV[PKRU]==0, ignores just-written value from update_pkru_in_sigframe() Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE") Suggested-by: Rudi Horn <rudi.horn@xxxxxxxxxx> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx> Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> Link: https://lore.kernel.org/all/20241119174520.3987538-3-aruna.ramakrishna%40oracle.com --- arch/x86/kernel/fpu/xstate.h | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h index 6b2924f..aa16f1a 100644 --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -72,10 +72,22 @@ static inline u64 xfeatures_mask_independent(void) /* * Update the value of PKRU register that was already pushed onto the signal frame. */ -static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru) +static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u64 mask, u32 pkru) { + u64 xstate_bv; + int err; + if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE))) return 0; + + /* Mark PKRU as in-use so that it is restored correctly. */ + xstate_bv = (mask & xfeatures_in_use()) | XFEATURE_MASK_PKRU; + + err = __put_user(xstate_bv, &buf->header.xfeatures); + if (err) + return err; + + /* Update PKRU value in the userspace xsave buffer. */ return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU)); } @@ -292,7 +304,7 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf, u32 pkr clac(); if (!err) - err = update_pkru_in_sigframe(buf, pkru); + err = update_pkru_in_sigframe(buf, mask, pkru); return err; }