Use a basic NOT+AND sequence to clear the Accessed bit in TDP MMU SPTEs, as opposed to the fancy ffs()+clear_bit() logic that was copied from the legacy MMU. The legacy MMU uses clear_bit() because it is operating on the SPTE itself, i.e. clearing needs to be atomic. The TDP MMU operates on a local variable that it later writes to the SPTE, and so doesn't need to be atomic or even resident in memory. Opportunistically drop unnecessary initialization of new_spte, it's guaranteed to be written before being accessed. Using NOT+AND instead of ffs()+clear_bit() reduces the sequence from: 0x0000000000058be6 <+134>: test %rax,%rax 0x0000000000058be9 <+137>: je 0x58bf4 <age_gfn_range+148> 0x0000000000058beb <+139>: test %rax,%rdi 0x0000000000058bee <+142>: je 0x58cdc <age_gfn_range+380> 0x0000000000058bf4 <+148>: mov %rdi,0x8(%rsp) 0x0000000000058bf9 <+153>: mov $0xffffffff,%edx 0x0000000000058bfe <+158>: bsf %eax,%edx 0x0000000000058c01 <+161>: movslq %edx,%rdx 0x0000000000058c04 <+164>: lock btr %rdx,0x8(%rsp) 0x0000000000058c0b <+171>: mov 0x8(%rsp),%r15 to: 0x0000000000058bdd <+125>: test %rax,%rax 0x0000000000058be0 <+128>: je 0x58beb <age_gfn_range+139> 0x0000000000058be2 <+130>: test %rax,%r8 0x0000000000058be5 <+133>: je 0x58cc0 <age_gfn_range+352> 0x0000000000058beb <+139>: not %rax 0x0000000000058bee <+142>: and %r8,%rax 0x0000000000058bf1 <+145>: mov %rax,%r15 thus eliminating several memory accesses, including a locked access. Cc: Ben Gardon <bgardon@xxxxxxxxxx> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx> --- arch/x86/kvm/mmu/tdp_mmu.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 724088bea4b0..161b77925a19 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -951,7 +951,7 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, { struct tdp_iter iter; int young = 0; - u64 new_spte = 0; + u64 new_spte; rcu_read_lock(); @@ -966,8 +966,7 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot, new_spte = iter.old_spte; if (spte_ad_enabled(new_spte)) { - clear_bit((ffs(shadow_accessed_mask) - 1), - (unsigned long *)&new_spte); + new_spte &= ~shadow_accessed_mask; } else { /* * Capture the dirty status of the page, so that it doesn't get -- 2.31.0.291.g576ba9dcdaf-goog