On 12/11/21 00:04, Ignat Korchagin wrote:
That is, I never get a stack with
kvm_tdp_mmu_put_root->..->kvm_set_pfn_dirty with a "good" run.
Perhaps, this may shed some light onto what is going on.
Maybe not kvm_tdp_mmu_put_root->...->kvm_set_pfn_dirty per se, but
do_exit->kvm_tdp_mmu_put_root->...->kvm_set_pfn_dirty seems to be
part of the problem.
Both kvm_set_pfn_dirty and kvm_set_pfn_accessed, which is where
execution really goes in the weeds, have this conditional:
if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
...
And indeed kvm_is_zone_device_pfn(pfn) returns false if the WARN_ON_ONCE
fires. What happens is that the page has already been released by the
process's exiting, so it has no A/D tracking anymore. But the conditional
is true and bad things happen in workingset_activation: while
!page_count(pfn_to_page(pfn)) is definitely not a ZONE_DEVICE page,
it's _also_ not a page that should be marked dirty or accessed.
Something like the following, while completely wrong or at least nothing
more than a bandaid, should at least avoid the worst consequences of the
bug:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 168d0ab93c88..699455715699 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -176,6 +176,14 @@ bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
return is_zone_device_page(pfn_to_page(pfn));
}
+static inline bool kvm_pfn_has_accessed_dirty(kvm_pfn_t pfn)
+{
+ if (!pfn_valid(pfn) || !page_count(pfn_to_page(pfn)))
+ return false;
+
+ return !PageReserved(pfn_to_page(pfn)) || is_zero_pfn(pfn);
+}
+
bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
{
/*
@@ -2812,14 +2820,14 @@ EXPORT_SYMBOL_GPL(kvm_release_pfn_dirty);
void kvm_set_pfn_dirty(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+ if (kvm_pfn_has_accessed_dirty(pfn))
SetPageDirty(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
void kvm_set_pfn_accessed(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
+ if (kvm_pfn_has_accessed_dirty(pfn))
mark_page_accessed(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
The real question is why kvm_mmu_free_roots is finding some dirty pages
in the do_exit->exit_files->...->close_files path, well after exit_mm()
has finished running. I'm not sure how kvm_mmu_zap_all could leave
something behind.
Th might be completely off track, but maybe it helps someone fixing
the bug while I get some sleep.
Paolo