Re: Potential bug in TDP MMU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/11/21 00:04, Ignat Korchagin wrote:
That is, I never get a stack with
kvm_tdp_mmu_put_root->..->kvm_set_pfn_dirty with a "good" run.
Perhaps, this may shed some light onto what is going on.

Maybe not kvm_tdp_mmu_put_root->...->kvm_set_pfn_dirty per se, but
do_exit->kvm_tdp_mmu_put_root->...->kvm_set_pfn_dirty seems to be
part of the problem.

Both kvm_set_pfn_dirty and kvm_set_pfn_accessed, which is where
execution really goes in the weeds, have this conditional:

	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
		...

And indeed kvm_is_zone_device_pfn(pfn) returns false if the WARN_ON_ONCE
fires.  What happens is that the page has already been released by the
process's exiting, so it has no A/D tracking anymore.  But the conditional
is true and bad things happen in workingset_activation: while
!page_count(pfn_to_page(pfn)) is definitely not a ZONE_DEVICE page,
it's _also_ not a page that should be marked dirty or accessed.

Something like the following, while completely wrong or at least nothing
more than a bandaid, should at least avoid the worst consequences of the
bug:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 168d0ab93c88..699455715699 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -176,6 +176,14 @@ bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
 	return is_zone_device_page(pfn_to_page(pfn));
 }
+static inline bool kvm_pfn_has_accessed_dirty(kvm_pfn_t pfn)
+{
+	if (!pfn_valid(pfn) || !page_count(pfn_to_page(pfn)))
+		return false;
+
+	return !PageReserved(pfn_to_page(pfn)) || is_zero_pfn(pfn);
+}
+
 bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 {
 	/*
@@ -2812,14 +2820,14 @@ EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+	if (kvm_pfn_has_accessed_dirty(pfn))
 		SetPageDirty(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
-	if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+	if (kvm_pfn_has_accessed_dirty(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);

The real question is why kvm_mmu_free_roots is finding some dirty pages
in the do_exit->exit_files->...->close_files path, well after exit_mm()
has finished running.  I'm not sure how kvm_mmu_zap_all could leave
something behind.

Th might be completely off track, but maybe it helps someone fixing
the bug while I get some sleep.

Paolo



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux