On Wed, Jul 31, 2024 at 5:21 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> It's not even remotely close to 100 instructions.  It's not even 10 instructions.
> It's 3 instructions, and maybe two uops?

Well yeah, I meant 100 instructions over the whole execution of the VM...

Paolo

> Modern compilers are smart enough to optimize usage of kvm_mmu_commit_zap_page()
> so that the caller inlines the list_empty(invalid_list) check, but the guts of
> the zap code are non-inlined.
>
> So, as is, the generated code is:
>
>    0x00000000000599a7 <+55>:    mov    0x8d40(%r12),%rbp
>    0x00000000000599af <+63>:    cmp    %rbp,%r15
>    0x00000000000599b2 <+66>:    mov    0x8(%rbp),%rbx
>    0x00000000000599b6 <+70>:    je     0x599d6 <kvm_zap_obsolete_pages+102>
>
>    0x00000000000599d6 <+102>:   mov    0x8d48(%r12),%rax
>    0x00000000000599de <+110>:   cmp    %r14,%rax
>    0x00000000000599e1 <+113>:   je     0x59a5f <kvm_zap_obsolete_pages+239>
>
>    0x0000000000059a5f <+239>:   mov    0x8(%rsp),%rax
>    0x0000000000059a64 <+244>:   sub    %gs:0x28,%rax
>    0x0000000000059a6d <+253>:   jne    0x59a86 <kvm_zap_obsolete_pages+278>
>    0x0000000000059a6f <+255>:   add    $0x10,%rsp
>    0x0000000000059a73 <+259>:   pop    %rbx
>    0x0000000000059a74 <+260>:   pop    %rbp
>    0x0000000000059a75 <+261>:   pop    %r12
>    0x0000000000059a77 <+263>:   pop    %r13
>    0x0000000000059a79 <+265>:   pop    %r14
>    0x0000000000059a7b <+267>:   pop    %r15
>    0x0000000000059a7d <+269>:   ret
>
> and adding an extra list_empty(kvm->arch.active_mmu_pages) generates:
>
>    0x000000000005999a <+42>:    mov    0x8d38(%rdi),%rax
>    0x00000000000599a1 <+49>:    cmp    %rax,%r15
>    0x00000000000599a4 <+52>:    je     0x59a6f <kvm_zap_obsolete_pages+255>
>
>    0x0000000000059a6f <+255>:   mov    0x8(%rsp),%rax
>    0x0000000000059a74 <+260>:   sub    %gs:0x28,%rax
>    0x0000000000059a7d <+269>:   jne    0x59a96 <kvm_zap_obsolete_pages+294>
>    0x0000000000059a7f <+271>:   add    $0x10,%rsp
>    0x0000000000059a83 <+275>:   pop    %rbx
>    0x0000000000059a84 <+276>:   pop    %rbp
>    0x0000000000059a85 <+277>:   pop    %r12
>    0x0000000000059a87 <+279>:   pop    %r13
>    0x0000000000059a89 <+281>:   pop    %r14
>    0x0000000000059a8b <+283>:   pop    %r15
>    0x0000000000059a8d <+285>:   ret
>
> i.e. it elides the list_empty(invalid_list) check, that's it.
>
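
For anyone who wants to poke at the codegen outside the kernel tree, the split described in the quoted text (an inlined list_empty() check in the caller, with the guts kept out of line) can be sketched in userspace roughly as below. This is only an illustration and not the KVM code: struct list_head is a minimal stand-in for the kernel's type, and commit_zap(), commit_zap_slow(), zap_obsolete() and slow_path_calls are hypothetical names. The noinline attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the kernel's circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Empty means the head points back at itself: one load and one compare. */
static inline bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static inline void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical counter, only here so calls to the slow path are observable. */
static int slow_path_calls;

/* Out-of-line "guts": only reached when there is something to zap. */
static __attribute__((noinline)) void commit_zap_slow(struct list_head *invalid_list)
{
	assert(!list_empty(invalid_list));
	slow_path_calls++;

	/* Unlink every entry; a real implementation would also free the pages. */
	while (!list_empty(invalid_list)) {
		struct list_head *entry = invalid_list->next;

		entry->prev->next = entry->next;
		entry->next->prev = entry->prev;
		entry->next = entry;
		entry->prev = entry;
	}
}

/* Inline wrapper: the emptiness check lands in the caller, the guts do not. */
static inline void commit_zap(struct list_head *invalid_list)
{
	if (list_empty(invalid_list))
		return;
	commit_zap_slow(invalid_list);
}

/*
 * Hypothetical caller with the extra early-out on the active list.
 * Returns 0 if it bailed early, 1 if it went on to the commit step.
 */
int zap_obsolete(struct list_head *active, struct list_head *invalid)
{
	if (list_empty(active))
		return 0;
	commit_zap(invalid);
	return 1;
}
```

Building this with optimizations enabled and disassembling zap_obsolete() should show the same general shape as the listings above, i.e. a load, a compare and a conditional jump per list_empty() check, though the exact instructions and offsets will depend on the compiler and flags.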