On 05/12/19 20:30, Peter Xu wrote:
>> Try enabling kvmmmu tracepoints too, it will tell you more of the
>> path that was taken while processing the EPT violation.
>
> These new tracepoints are extremely useful (which I didn't notice
> before).

Yes, they are!

> So here's the final culprit...
>
> void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
> {
> ...
>         spin_lock(&kvm->mmu_lock);
>         /* FIXME: we should use a single AND operation, but there is no
>          * applicable atomic API.
>          */
>         while (mask) {
>                 clear_bit_le(offset + __ffs(mask), memslot->dirty_bitmap);
>                 mask &= mask - 1;
>         }
>
>         kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask);
>         spin_unlock(&kvm->mmu_lock);
> }
>
> The mask is cleared before reaching
> kvm_arch_mmu_enable_log_dirty_pt_masked()..

I'm not sure why that results in two vmexits?  (Clearing before
kvm_arch_mmu_enable_log_dirty_pt_masked is also what
KVM_{GET,CLEAR}_DIRTY_LOG does.)

> The funny thing is that I did have a few more patches to even skip
> allocating the dirty_bitmap when dirty ring is enabled (hence in that
> tree I removed this while loop too, so it has no such problem).
> However, I dropped those patches when I posted the RFC because I don't
> think it's mature, and the selftest didn't complain about that
> either..  Though, I do plan to redo that in v2 if you don't disagree.
> The major question would be whether the dirty_bitmap could still be
> of any use if the dirty ring is enabled.

Userspace may want a dirty bitmap in addition to a list (for example:
list for migration, bitmap for framebuffer updates), but it can also do
a pass over the dirty rings in order to update an internal bitmap.  So
I think it makes sense to make it either one or the other.

Paolo
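
P.S. To make the "mask is cleared" point above concrete, here is a minimal
sketch against the snippet quoted above (not a tested patch; "scratch" is
just an illustrative local): iterating on a copy consumes the dirty_bitmap
bits while still handing the original mask to the arch hook.

	u64 scratch = mask;

	spin_lock(&kvm->mmu_lock);
	/*
	 * Clear the dirty_bitmap bits one by one, but iterate on a copy
	 * so that the original mask still reaches the arch hook below.
	 */
	while (scratch) {
		clear_bit_le(offset + __ffs(scratch), memslot->dirty_bitmap);
		scratch &= scratch - 1;
	}

	kvm_arch_mmu_enable_log_dirty_pt_masked(kvm, memslot, offset, mask);
	spin_unlock(&kvm->mmu_lock);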
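
P.P.S. On "a pass over the dirty rings in order to update an internal
bitmap": a rough userspace sketch of that idea, assuming each harvested
ring entry carries a (slot, page offset) pair roughly along the lines of
the proposed kvm_dirty_gfn; the struct and function names below are made
up for illustration only.

	#include <stddef.h>
	#include <stdint.h>

	#define BITS_PER_LONG	(8 * sizeof(unsigned long))

	/* Hypothetical harvested entry: memslot id + page offset within it. */
	struct dirty_entry {
		uint32_t slot;
		uint64_t offset;
	};

	/* Fold entries belonging to one memslot into a userspace bitmap. */
	static void entries_to_bitmap(const struct dirty_entry *e, size_t n,
				      unsigned long *bitmap)
	{
		for (size_t i = 0; i < n; i++)
			bitmap[e[i].offset / BITS_PER_LONG] |=
				1UL << (e[i].offset % BITS_PER_LONG);
	}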