Re: [PATCH v2] KVM: Avoid illegal stage2 mapping on invalid memory slot

On 09.06.23 12:04, Gavin Shan wrote:
We run into a guest hang in edk2 firmware when KSM is running on the
host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash
device (TYPE_PFLASH_CFI01) during a sector erase or buffered write
operation. The status is returned by reading the memory region of the
pflash device, and the read request should have been forwarded to QEMU
and emulated by it. Unfortunately, the read request is covered by an
illegal stage2 mapping when the guest hang occurs. The read request is
completed with QEMU bypassed and the wrong status is fetched. The edk2
firmware runs into an infinite loop with the wrong status.

The illegal stage2 mapping is populated due to same page sharing by KSM
at (C), even though the associated memory slot has been marked as invalid
at (B) when the memory slot is requested to be deleted. It's notable that
the active and inactive memory slots can't be swapped when we're in the
middle of kvm_mmu_notifier_change_pte() because
kvm->mn_active_invalidate_count is elevated, and kvm_swap_active_memslots()
will busy loop until it drops to zero again. Besides, swapping the active
and inactive memory slots is also prevented by holding &kvm->srcu in
__kvm_handle_hva_range(), corresponding to synchronize_srcu_expedited()
in kvm_swap_active_memslots().

   CPU-A                    CPU-B
   -----                    -----
                            ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
                            kvm_vm_ioctl_set_memory_region
                            kvm_set_memory_region
                            __kvm_set_memory_region
                            kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
                              kvm_invalidate_memslot
                                kvm_copy_memslot
                                kvm_replace_memslot
                                kvm_swap_active_memslots        (A)
                                kvm_arch_flush_shadow_memslot   (B)
   same page sharing by KSM
   kvm_mmu_notifier_invalidate_range_start
         :
   kvm_mmu_notifier_change_pte
     kvm_handle_hva_range
     __kvm_handle_hva_range       (C)
         :
   kvm_mmu_notifier_invalidate_range_end

Fix the issue by skipping the invalid memory slot at (C) to avoid the
illegal stage2 mapping, so that the read request for the pflash status
is forwarded to QEMU and emulated by it. In this way, the correct pflash
status can be returned from QEMU to break the infinite loop in the edk2
firmware.

Cc: stable@xxxxxxxxxxxxxxx # v5.13+
Fixes: 3039bcc74498 ("KVM: Move x86's MMU notifier memslot walkers to generic code")
Reported-by: Shuai Hu <hshuai@xxxxxxxxxx>
Reported-by: Zhenyu Zhang <zhenyzha@xxxxxxxxxx>
Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
---
v2: Improved changelog suggested by Marc
---
  virt/kvm/kvm_main.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 479802a892d4..7f81a3a209b6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -598,6 +598,9 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
  			unsigned long hva_start, hva_end;

  			slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]);
+			if (slot->flags & KVM_MEMSLOT_INVALID)
+				continue;
+
  			hva_start = max(range->start, slot->userspace_addr);
  			hva_end = min(range->end, slot->userspace_addr +
  						  (slot->npages << PAGE_SHIFT));
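
For readers who want to see the effect of the three added lines in
isolation, here is a minimal user-space sketch. It is not the kernel code:
struct memslot, SKETCH_MEMSLOT_INVALID, SKETCH_PAGE_SHIFT and
count_handled_slots() are hypothetical stand-ins, the flag value is
arbitrary, and a linear array walk replaces the interval tree lookup used
by __kvm_handle_hva_range(). It only shows that a slot flagged as invalid
is never handed to the handler, even when the HVA range overlaps it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the kernel definitions. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_MEMSLOT_INVALID	(1UL << 0)	/* arbitrary bit for this sketch */

struct memslot {
	unsigned long userspace_addr;	/* start HVA of the slot */
	unsigned long npages;		/* slot size in pages */
	unsigned long flags;
};

/*
 * Count the slots that would be passed to the notifier handler for the
 * HVA range [start, end). Invalid slots are skipped, as in the patch.
 */
static size_t count_handled_slots(const struct memslot *slots, size_t n,
				  unsigned long start, unsigned long end)
{
	size_t handled = 0;

	assert(start <= end);

	for (size_t i = 0; i < n; i++) {
		const struct memslot *slot = &slots[i];
		unsigned long hva_start, hva_end;

		/* The check added by the patch: ignore slots being deleted. */
		if (slot->flags & SKETCH_MEMSLOT_INVALID)
			continue;

		/* Clamp the range to the slot, as the surrounding code does. */
		hva_start = start > slot->userspace_addr ?
			    start : slot->userspace_addr;
		hva_end = slot->userspace_addr +
			  (slot->npages << SKETCH_PAGE_SHIFT);
		if (end < hva_end)
			hva_end = end;

		/* No overlap with this slot (the real code never sees these). */
		if (hva_start >= hva_end)
			continue;

		handled++;	/* the handler would run for this slot */
	}

	return handled;
}
```

With one valid slot, one invalid slot and a range covering both, only the
valid slot is counted, so no mapping would be installed for the slot that
is being deleted.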

Nice debugging!

LGTM

Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>

--
Cheers,

David / dhildenb



