On Mon, Feb 21, 2022, Chenyi Qiang wrote: > @@ -277,14 +278,18 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, > WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK)); > if (unlikely(mmu->pkr_mask)) { > u32 pkr_bits, offset; > + u32 pkr; > > /* > - * PKRU defines 32 bits, there are 16 domains and 2 > - * attribute bits per domain in pkru. pte_pkey is the > - * index of the protection domain, so pte_pkey * 2 is > - * is the index of the first bit for the domain. > + * PKRU and PKRS both define 32 bits. There are 16 domains > + * and 2 attribute bits per domain in them. pte_key is the > + * index of the protection domain, so pte_pkey * 2 is the > + * index of the first bit for the domain. The use of PKRU > + * versus PKRS is selected by the address type, as determined > + * by the U/S bit in the paging-structure entries. > */ > - pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3; > + pkr = pte_access & PT_USER_MASK ? vcpu->arch.pkru : kvm_read_pkrs(vcpu); Blindly reading PKRU/PKRS is wrong. I think this magic insanity will be functionally correct due to update_pkr_bitmask() clearing the appropriate bits in pkr_mask based on CR4.PK*, but the read should never happen. PKRU is benign, but I believe reading PKRS will result in VMREAD to an invalid field if PKRU is supported and enabled, but PKRS is not supported. I belive the easiest solution is: if (pte_access & PT_USER_MASK) pkr = is_cr4_pke(mmu) ? vcpu->arch.pkru : 0; else pkr = is_cr4_pks(mmu) ? kvm_read_pkrs(vcpu) : 0; The is_cr4_pk*() helpers are restricted to mmu.c, but this presents a good opportunity to extra the PKR stuff to a separate, non-inline helper (as a prep patch). E.g. WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK)); if (unlikely(mmu->pkr_mask)) u32 pkr_bits = kvm_mmu_pkr_bits(vcpu, mmu, pte_access, pte_pkey); errcode |= -pkr_bits & PFERR_PK_MASK; fault |= (pkr_bits != 0); } return -(u32)fault & errcode; permission_fault() is inline because it's heavily used for shadow paging, but when using TDP, it's far less performance critical. PKR is TDP-only, so moving it out-of-line should be totally ok (this is also why this patch is "unlikely").