On 08/03/2016 10:19, Xiao Guangrong wrote: >>> >>> As PKEY is not enabled on softmmu, the gva_to_gpa mostly comes from >>> internal >>> KVM, that means we should always set PFEC.PKEY for all the gva_to_gpa >>> request. >>> Wasting a bit is really unnecessary. >>> >>> And it is always better to move more workload from permission_fault() to >>> update_permission_bitmask() as the former is much hotter than the >>> latter. >> >> I agree, but I'm not sure why you say that adding a bits adds more work >> to permission_fault(). > > A branch to check PFEC.PKEY, which is not well predictable on soft mmu. (It > should always be set in EPT as the page table walking is done by software, > however, if we only consider EPT we can assume it is always true). Ok, I agree we shouldn't add such a branch. It's really messy that PKERR_PK_MASK is computed at runtime. It means we need to check U=1 in permission_fault, and we need to rethink all of the handling of PKERR_PK_MASK in fact. I think we should extend the algorithm from earlier in the thread to handle the AD bit too: /* Bit 0: should PKRU.AD be considered if PFEC=0 or 1? * Bit 1: should PKRU.WD be considered if PFEC=0 or 1? * Bit 2: should PKRU.AD be considered if PFEC=2 or 3? * Bit 3: should PKRU.WD be considered if PFEC=2 or 3? * Ignore PKRU if U=0. * etc. */ u32 pkru_mask; In update_permission_bitmask, the bits of mmu->pkru_mask only depend on the byte, not on the bit. Bits 0, 2, ... are computed from pku, i.e. cr4_pku && long_mode && !ff. Bits 1, 3, ... are computed from pku && (wf && (is_write_protection(vcpu) || uf)). The permissions[] array can keep 16 entries, because PFERR_PK_MASK is never passed to permission_fault (like PFERR_RSVD_MASK). The permission_fault code is something like: WARN_ON(pfec & PFERR_PK_MASK); pkru_mask = pte_access & ACC_USER_MASK ? mmu->pkru_mask : 0; pkru_bits &= pkru_mask >> (pfec & ~1); pfec |= -pkru_bits & PFERR_PK_MASK; What do you think? I still believe that the above checks should be wrapped in a if (unlikely(mmu->pkru_mask) { ... } It should be predicted well and removes the potential performance cost of branchless code on !PKE OSes. >> Adding a bit lets us skip CR4.PKU and EFER.LMA checks in >> permission_fault() and in all gva_to_gpa() callers. > > The point is when we can clear this bit to skip these checks. We should > _always_ check PKEY even if PFEC.PKEY = 0, because: > 1) all gva_to_gpa()s issued by KVM should always check PKEY. This is the > case of ept only. > > 2) if the feature is enabled in softmmu, shadow page table may change its > behavior, for example, the mmio-access causes a reserved PF which > may clear PFEC.PKEY. > > And skipping these checks is not really necessary as we can take them into > account when we update the bitmask. Got it. Does the above pseudocode match your idea? It disregards the pagefault's PFERR_PK_MASK completely. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html