[PATCH] KVM: use set_page_dirty rather than SetPageDirty

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At Facebook, we have hit a bug in an interaction between KVM and btrfs
running android emulators in a build/test environment using the android
emulator's ram snapshot features (-snapshot and -snapstorage). The
important aspect of those features is that they result in qemu mmap-ing
a file rather than anonymous memory for the guest's memory.

Ultimately, we observe (with drgn) pages of the mapped file stuck in
btrfs writeback because the mapping's xarray lacks the expected dirty
tags. I have not yet been able to pin down the exact vm behavior that
results in these bad kvm_set_pfn_dirty calls, but I caught them by
instrumenting SetPageDirty with a warning, getting a stack trace like:

RIP: 0010:kvm_set_pfn_dirty+0xaf/0xd0 [kvm]
<snip>
 Call Trace:
  kvm_release_pfn+0x2d/0x40 [kvm]
  __kvm_map_gfn+0x115/0x2b0 [kvm]
  kvm_arch_vcpu_ioctl_run+0x1538/0x1b30 [kvm]
  ? call_function_interrupt+0xa/0x20
  kvm_vcpu_ioctl+0x232/0x5e0 [kvm]
  ksys_ioctl+0x83/0xc0
  __x64_sys_ioctl+0x16/0x20
  do_syscall_64+0x42/0x110
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

kvm_arch_vcpu_ioctl_run+0x1538 is the call to complete_userspace_io on
line 8728, for what it's worth. I also confirmed that the page being
dirtied in this codepath is the one we end up stuck on.

This is on a kernel based off of 5.6, but as far as I can tell, the
behavior in KVM is still incorrect currently, as it doesn't account for
file backed pages.

I tested this fix on the workload and it did prevent the hangs. However,
I am unsure if the fix is appropriate from a locking perspective, so I
hope to draw some extra attention to that aspect. set_page_dirty_lock in
mm/page-writeback.c has a comment about locking that says set_page_dirty
should be called with the page locked or while definitely holding a
reference to the mapping's host inode. I believe that the mmap should
have that reference, so for fear of hurting KVM performance or
introducing a deadlock, I opted for the unlocked variant.

Signed-off-by: Boris Burkov <boris@xxxxxx>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2755ba4177d6..432c109664c3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2811,7 +2811,7 @@ EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
 void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
-		SetPageDirty(pfn_to_page(pfn));
+		set_page_dirty(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
 
-- 
2.34.0




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux