[Bug 217562] kernel NULL pointer dereference on deletion of guest physical memory slot

https://bugzilla.kernel.org/show_bug.cgi?id=217562

--- Comment #2 from Arnaud Lefebvre (arnaud.lefebvre@xxxxxxxxxxxxxxxx) ---
Thanks a lot for that very detailed reply!

> TL;DR: I'm 99% certain you're hitting a race that results in KVM doing a
> list_del() before a list_add().  I am planning on sending a patch for v5.15
> to disable the TDP MMU by default, which will "fix" this bug, but I have an
> extra long weekend and won't get to that before next Thursday or so.

> In the meantime, you can effect the same fix by disabling the TDP MMU via
> module param, i.e. add kvm.tdp_mmu=false to your kernel/KVM command line.

Alright, thanks for the tip. We'll probably just upgrade to the 6.1 LTS. That
was already planned, but we weren't sure whether the bug was present there too.

> If you're feeling particularly masochistic, I bet you could reproduce this
> more easily by introducing a delay between setting the SPTE and linking the
> page, e.g.
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 6c2bb60ccd88..1fb10d4156aa 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1071,6 +1071,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
>                                                      !shadow_accessed_mask);
>  
>                         if (tdp_mmu_set_spte_atomic_no_dirty_log(vcpu->kvm, &iter, new_spte)) {
> +                               udelay(100);
>                                 tdp_mmu_link_page(vcpu->kvm, sp,
>                                                   huge_page_disallowed &&
>                                                   req_level >= iter.level);

We might try that if we can find some time in the upcoming weeks, just to be
sure that we can actually reproduce the bug and put this behind us.

Regarding this bug report, how should we proceed from here? Should we close it,
keep it open for a few weeks until we can confirm that the issue no longer
occurs on 6.1, or leave it to you to handle once you disable the TDP MMU by
default on the v5.15 LTS?

Thanks for your advice.



