We run into a guest hang in the edk2 firmware when KSM is kept running
on the host. The edk2 firmware is waiting for status 0x80 from QEMU's
pflash device (TYPE_PFLASH_CFI01) during a sector erase or buffered
write operation. The status is returned by reading the memory region of
the pflash device, and the read request should have been forwarded to
QEMU and emulated by it. Unfortunately, the read request is covered by
an illegal stage2 mapping when the guest hang occurs. The read request
is completed with QEMU bypassed and the wrong status is fetched. The
illegal stage2 mapping is populated due to same page merging by KSM at
(C), even though the associated memory slot has been marked as invalid
at (B).

  CPU-A                    CPU-B
  -----                    -----
  ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
  kvm_vm_ioctl_set_memory_region
  kvm_set_memory_region
  __kvm_set_memory_region
  kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
    kvm_invalidate_memslot
      kvm_copy_memslot
      kvm_replace_memslot
      kvm_swap_active_memslots        (A)
      kvm_arch_flush_shadow_memslot   (B)
                           same page merging by KSM
                           kvm_mmu_notifier_change_pte
                           kvm_handle_hva_range
                           __kvm_handle_hva_range       (C)

Fix the issue by skipping the invalid memory slot at (C) to avoid the
illegal stage2 mapping. Without the illegal stage2 mapping, the read
request for the pflash's status is forwarded to QEMU and emulated by it.
The correct pflash status is then returned from QEMU, which breaks the
infinite wait in the edk2 firmware.

Cc: stable@xxxxxxxxxxxxxxx # v5.13+
Fixes: 3039bcc74498 ("KVM: Move x86's MMU notifier memslot walkers to generic code")
Reported-by: Shuai Hu <hshuai@xxxxxxxxxx>
Reported-by: Zhenyu Zhang <zhenyzha@xxxxxxxxxx>
Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
---
 virt/kvm/kvm_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 479802a892d4..7f81a3a209b6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -598,6 +598,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
 			unsigned long hva_start, hva_end;
 
 			slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]);
+			if (slot->flags & KVM_MEMSLOT_INVALID)
+				continue;
+
 			hva_start = max(range->start, slot->userspace_addr);
 			hva_end = min(range->end, slot->userspace_addr +
 						  (slot->npages << PAGE_SHIFT));
-- 
2.23.0
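
For readers who want to see the effect of the check outside the kernel, the following is a minimal userspace sketch of a range walk that skips invalid slots. It is not the kernel code: `struct slot_sketch`, `SLOT_INVALID`, `PAGE_SHIFT_SKETCH` and `handle_hva_range_sketch()` are hypothetical names made up for illustration, the flag value is an arbitrary placeholder rather than the real `KVM_MEMSLOT_INVALID`, and a flat array stands in for the kernel's slot lookup structure.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the kernel headers. */
#define SLOT_INVALID		(1u << 16)	/* placeholder flag value */
#define PAGE_SHIFT_SKETCH	12		/* assumes 4 KiB pages */

struct slot_sketch {
	uint64_t userspace_addr;	/* start of the slot in host virtual memory */
	uint64_t npages;		/* slot size in pages */
	uint32_t flags;
};

/*
 * Walk the slots and count those that overlap [start, end) and would
 * therefore be handed to the notifier handler. Slots marked invalid are
 * skipped before any overlap is computed, mirroring the check in the patch.
 */
static int handle_hva_range_sketch(const struct slot_sketch *slots, int nslots,
				   uint64_t start, uint64_t end)
{
	int handled = 0;

	for (int i = 0; i < nslots; i++) {
		const struct slot_sketch *slot = &slots[i];
		uint64_t slot_end = slot->userspace_addr +
				    (slot->npages << PAGE_SHIFT_SKETCH);
		uint64_t hva_start, hva_end;

		/* A slot being deleted must not receive new mappings. */
		if (slot->flags & SLOT_INVALID)
			continue;

		/* Clamp the requested range to the slot boundaries. */
		hva_start = start > slot->userspace_addr ? start : slot->userspace_addr;
		hva_end = end < slot_end ? end : slot_end;
		if (hva_start >= hva_end)
			continue;

		handled++;
	}

	return handled;
}
```

In this model, a slot whose flags include `SLOT_INVALID` is never counted even when the requested range fully covers it, which corresponds to the notifier at (C) no longer installing a mapping for a slot that was invalidated at (B).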