On Tue, Jan 26, 2021, Paolo Bonzini wrote: > On 30/09/20 06:36, Sean Christopherson wrote: > > > CR4.PKS is not in the list of CR4 bits that result in a PDPTE load. > > > Since it has no effect on PAE paging, I would be surprised if it did > > > result in a PDPTE load. > > It does belong in the mmu_role_bits though;-) > > > > Does it? We don't support PKU/PKS for shadow paging, and it's always zero > for EPT. We only support enough PKU/PKS for emulation. As proposed, yes. The PKU/PKS mask is tracked on a per-mmu basis, e.g. computed in update_pkr_bitmask() and consumed in permission_fault() during emulation. Omitting CR4.PKS from the extended role could let KVM reuse an MMU with the wrong pkr_mask. That being said, I don't think it needs a dedicated bit. IIUC, the logic is PKU|PKS, with pkr_mask generation being PKU vs. PKS agnostic. The role could do the same and smush the bits together, e.g. diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 3d6616f6f6ef..3bfca34f6ea2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -293,7 +293,7 @@ union kvm_mmu_extended_role { unsigned int cr0_pg:1; unsigned int cr4_pae:1; unsigned int cr4_pse:1; - unsigned int cr4_pke:1; + unsigned int cr4_pkr:1; unsigned int cr4_smap:1; unsigned int cr4_smep:1; unsigned int maxphyaddr:6; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 79166288ed03..2774b85a36d5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4448,7 +4448,8 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu) ext.cr4_smep = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMEP); ext.cr4_smap = !!kvm_read_cr4_bits(vcpu, X86_CR4_SMAP); ext.cr4_pse = !!is_pse(vcpu); - ext.cr4_pke = !!kvm_read_cr4_bits(vcpu, X86_CR4_PKE); + ext.cr4_pkr = !!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) || + !!kvm_read_cr4_bits(vcpu, X86_CR4_PKS); ext.maxphyaddr = cpuid_maxphyaddr(vcpu); ext.valid = 1; Another option would be to move the tracking out of the MMU, e.g. make pkr_mask per-vCPU and recalculate when CR4 changes. I think that would "just work", even when nested VMs are in play?