Re: [PATCH 3/3] KVM: gmem: track preparedness a page at a time

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 13, 2024, Sean Christopherson wrote:
> On Fri, Nov 08, 2024, Paolo Bonzini wrote:
> >  static void kvm_gmem_mark_prepared(struct file *file, pgoff_t index, struct folio *folio)
> >  {
> > -	folio_mark_uptodate(folio);
> > +	struct kvm_gmem_inode *i_gmem = (struct kvm_gmem_inode *)file->f_inode->i_private;
> > +	unsigned long *p = i_gmem->prepared + BIT_WORD(index);
> > +	unsigned long npages = folio_nr_pages(folio);
> > +
> > +	/* Folios must be naturally aligned */
> > +	WARN_ON_ONCE(index & (npages - 1));
> > +	index &= ~(npages - 1);
> > +
> > +	/* Clear page before updating bitmap.  */
> > +	smp_wmb();
> > +
> > +	if (npages < BITS_PER_LONG) {
> > +		bitmap_set_atomic_word(p, index, npages);
> > +	} else {
> > +		BUILD_BUG_ON(BITS_PER_LONG != 64);
> > +		memset64((u64 *)p, ~0, BITS_TO_LONGS(npages));
> 
> More code that could be deduplicated (unprepared path below).
> 
> But more importantly, I'm pretty sure the entire lockless approach is unsafe.
> 
> Callers of kvm_gmem_get_pfn() do not take any locks that protect kvm_gmem_mark_prepared()
> from racing with kvm_gmem_mark_range_unprepared().  kvm_mmu_invalidate_begin()
> prevents KVM from installing a stage-2 mapping, i.e. from consuming the PFN, but
> it doesn't prevent kvm_gmem_mark_prepared() from being called.  And
> sev_handle_rmp_fault() never checks mmu_notifiers, which is probably fine?  But
> sketchy.
> 
> Oof.  Side topic.  sev_handle_rmp_fault() is suspect for other reasons.  It does
> its own lookup of the PFN, and so could see an entirely different PFN than was
> resolved by kvm_mmu_page_fault().  And thanks to KVM's wonderful 0/1/-errno
> behavior, sev_handle_rmp_fault() is invoked even when KVM wants to retry the
> fault, e.g. due to an active MMU invalidation.
> 
> Anyways, KVM wouldn't _immediately_ consume bad data, as the invalidation
> would block the current page fault.  But clobbering i_gmem->prepared would result
> in a missed "prepare" phase if the hole-punch region were restored.
> 
> One idea would be to use a rwlock to protect updates to the bitmap (setters can
> generally stomp on each other).  And to avoid contention whenever possible in
> page fault paths, only take the lock if the page is not up-to-date, because again
> kvm_mmu_invalidate_{begin,end}() protects against UAF, it's purely updates to the
> bitmap that need extra protection.

Actually, there's no point in having a rwlock, because readers are serialized on
the folio's lock.  And KVM absolutely relies on that already, because otherwise
multiple vCPUs could see an unprepared folio, and clear_highpage() could end up
racing with writes from the vCPU.

> Note, using is mmu_invalidate_retry_gfn() is unsafe, because it must be called
> under mmu_lock to ensure it doesn't get false negatives.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux