On 13/02/2023 13:50, Paolo Bonzini wrote:
> On 2/13/23 13:44, Jeremi Piotrowski wrote:
>> Just built a kernel from that tree, and it displays the same behavior.
>> The problem is not that the addresses are wrong, but that the flushes
>> are issued at the wrong time now. At least for what "enlightened NPT
>> TLB flush" requires.
> 
> It is not clear to me why HvCallFlushGuestPhysicalAddressSpace or
> HvCallFlushGuestPhysicalAddressList would have stricter requirements
> than a "regular" TLB shootdown using INVEPT.
> 
> Can you clarify what you mean by wrong time, preferably with some kind
> of sequence of events?
> 
> That is, something like
> 
>   CPU 0  Modify EPT from ... to ...
>   CPU 0  call_rcu() to free page table
>   CPU 1  ... which is invalid because ...
>   CPU 0  HvCallFlushGuestPhysicalAddressSpace
> 
> Paolo

So I looked at the ftrace output (all kvm & kvmmu events + hyperv_nested_*
events) and I see the following:

With tdp_mmu=0:

  kvm_exit
  sequence of kvm_mmu_prepare_zap_page
  hyperv_nested_flush_guest_mapping (always follows every sequence of
    kvm_mmu_prepare_zap_page)
  kvm_entry

With tdp_mmu=1 I see:

  kvm_mmu_prepare_zap_page and kvm_tdp_mmu_spte_changed events from a
  kworker context, but they are not followed by
  hyperv_nested_flush_guest_mapping. The only
  hyperv_nested_flush_guest_mapping events I see happen from the qemu
  process context.

Also the number of flush hypercalls is significantly lower: a 7-second
sequence through OVMF with tdp_mmu=0 produces ~270 flush hypercalls. In
the traces with tdp_mmu=1 I now see at most 3.

So this might be easier to diagnose than I thought: the
HvCallFlushGuestPhysicalAddressSpace calls are missing now.
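
In case it helps with comparing runs, below is roughly the kind of
throwaway helper I'd use to tally those tracepoints in a plain-text
ftrace dump (e.g. the output of "trace-cmd report" or a copy of
/sys/kernel/tracing/trace); it just substring-matches "<event>:" per
line, so treat the numbers as approximate:

/*
 * count_flush_events.c - rough counter for the tracepoints discussed
 * above. Assumes a plain-text ftrace dump where each event shows up
 * as "<event name>:" after the timestamp.
 */
#include <stdio.h>
#include <string.h>

static const char * const events[] = {
	"kvm_mmu_prepare_zap_page",
	"kvm_tdp_mmu_spte_changed",
	"hyperv_nested_flush_guest_mapping",
};
#define NR_EVENTS (sizeof(events) / sizeof(events[0]))

int main(int argc, char **argv)
{
	unsigned long counts[NR_EVENTS] = { 0 };
	char line[4096], needle[64];
	FILE *f;

	if (argc < 2 || !(f = fopen(argv[1], "r"))) {
		fprintf(stderr, "usage: %s <ftrace text dump>\n", argv[0]);
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		for (size_t i = 0; i < NR_EVENTS; i++) {
			/* ftrace prints the event name followed by ':' */
			snprintf(needle, sizeof(needle), "%s:", events[i]);
			if (strstr(line, needle))
				counts[i]++;
		}
	}
	fclose(f);

	for (size_t i = 0; i < NR_EVENTS; i++)
		printf("%-36s %lu\n", events[i], counts[i]);

	return 0;
}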