Re: [PATCH] KVM: pfncache: rework __kvm_gpc_refresh() to fix locking issues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 12, 2024, David Woodhouse wrote:
> This function can race with kvm_gpc_deactivate(). Since that function
> does not take the ->refresh_lock, it can wipe and unmap the pfn and
> khva while hva_to_pfn_retry() has dropped its write lock on gpc->lock.
> 
> Then if hva_to_pfn_retry() determines that the PFN hasn't changed and
> that it can re-use the old pfn and khva, they get assigned back to
> gpc->pfn and gpc->khva even though the khva was already unmapped by
> kvm_gpc_deactivate(). This leaves the cache in an apparently valid
> state but with ->khva pointing to an address which has been unmapped.
> Which in turn leads to oopses in e.g. __kvm_xen_has_interrupt() and
> set_shinfo_evtchn_pending().
> 
> It may be possible to fix this just by making kvm_gpc_deactivate()
> take the ->refresh_lock, but that still leaves ->refresh_lock being
> basically redundant with the write lock on ->lock, which frankly
> makes my skin itch, with the way that pfn_to_hva_retry() operates on
> fields in the gpc without holding ->lock.
> 
> Instead, fix it by cleaning up the semantis of hva_to_pfn_retry(). It
> no longer operates on the gpc object at all; it's called with a uhva
> and returns the corresponding pfn (pinned), and a mapped khva for it.
> 
> The calling function __kvm_gpc_refresh() now drops ->lock before calling
> hva_to_pfn_retry(), then retakes the lock before checking for changes,
> and discards the new mapping if it lost a race. And will correctly
> note the old pfn/khva to be unmapped at the right time, instead of
> preserving them in a local variable while dropping the lock.
> 
> The optimisation in hva_to_pfn_retry() where it attempts to use the
> old mapping if the pfn doesn't change is dropped, since it makes the
> pinning more complex. It's a pointless optimisation anyway, since the
> odds of the pfn ending up the same when the uhva has changed (i.e.
> the odds of the two userspace addresses both pointing to the same
> underlying physical page) are negligible,
> 
> I remain slightly confused because although this is clearly a race in
> the gfn_to_pfn_cache code, I don't quite know how the Xen support code
> actually managed to trigger it. We've seen oopses from dereferencing a
> valid-looking ->khva in both __kvm_xen_has_interrupt() (the vcpu_info)
> and in set_shinfo_evtchn_pending() (the shared_info). But surely the
> race shouldn't happen for the vcpu_info gpc because all calls to both
> refresh and deactivate hold the vcpu mutex, and it shouldn't happen

FWIW, neither kvm_xen_destroy_vcpu() nor kvm_xen_destroy_vm() holds the appropriate
mutex.  

> for the shared_info gpc because all calls to both will hold the
> kvm->arch.xen.xen_lock mutex.
>
> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> ---
> 
> This is based on (and in) my tree at
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/xenfv
> which has all the other outstanding KVM/Xen fixes.
> 
>  virt/kvm/pfncache.c | 181 +++++++++++++++++++++-----------------------
>  1 file changed, 85 insertions(+), 96 deletions(-)

NAK, at least as a bug fix.  We've already shuffled deck chairs on the Titanic
several times, I have zero confidence that doing so one more time is going to
truly solve the underlying mess.

The contract with the gfn_to_pfn_cache, or rather the lack thereof, is all kinds
of screwed up.  E.g. I added the mutex in commit 93984f19e7bc ("KVM: Fully serialize
gfn=>pfn cache refresh via mutex") to guard against concurrent unmap(), but the
unmap() API has since been removed.  We need to define an actual contract instead
of continuing to throw noodles as the wall in the hope that something sticks.

As you note above, some other mutex _should_ be held.  I think we should lean
into that.  E.g.

  1. Pass in the guarding mutex to kvm_gpc_init() and assert that said mutex is
     held for __refresh(), activate(), and deactivate().
  2. Fix the cases where that doesn't hold true.
  3. Drop refresh_mutex





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux