Re: [PATCH v6 1/4] KVM: mmu: introduce new gfn_to_pfn_noref functions

On Wed, May 24, 2023, Peter Xu wrote:
> On Wed, May 24, 2023 at 09:46:13AM -0700, Sean Christopherson wrote:
> > If we hack kvm_pfn_to_refcounted_page(), then all of those protections are lost
> > because KVM would drop its assertions and also skip dirtying pages, i.e. would
> > effectively suppress the latent detection by check_new_page_bad().
> 
> So it's probably that I totally have no idea what are the attributes for
> those special pages so I don't understand enough on why we need to handle
> those pages differently from e.g. PFNMAP pages, and also the benefits.
> 
> I think what I can tell is that they're pages that doesn't have
> PageCompound bits set on either head or tails, however it's still a
> multi-2-order large page.  Is there an example on how these pages are used
> and allocated?  Why would we need those pages, and whether these pages need
> to be set dirty/accessed after all?

The use case David is interested in is where an AMD GPU driver kmalloc()s a
chunk of memory, lets it be mmap()'d by userspace, and userspace then maps it
into the guest for a virtual (passthrough?) GPU.  For all intents and purposes,
it's normal memory, just not refcounted.

> >  static bool kvm_is_ad_tracked_page(struct page *page)
> >  {
> > +       /*
> > +        * Assert that KVM isn't attempting to mark a freed page as Accessed or
> > +        * Dirty, i.e. that KVM's MMU doesn't have a use-after-free bug.  KVM
> > +        * (typically) doesn't pin pages that are mapped in KVM's MMU, and
> > +        * instead relies on mmu_notifiers to know when a mapping needs to be
> > +        * zapped/invalidated.  Unmapping from KVM's MMU must happen _before_
> > +        * KVM returns from its mmu_notifier, i.e. the page should have an
> > +        * elevated refcount at this point even though KVM doesn't hold a
> > +        * reference of its own.
> > +        */
> > +       if (WARN_ON_ONCE(!page_count(page)))
> > +               return false;
> > +
> >         /*
> >          * Per page-flags.h, pages tagged PG_reserved "should in general not be
> >          * touched (e.g. set dirty) except by its owner".
> > 
> 
> This looks like a good thing to have, indeed.  But again it doesn't seem
> like anything special to the pages we're discussing here, say, !Compound &&
> refcount==0 ones.

The problem is that if KVM ignores refcount==0 pages, then KVM can't distinguish
between the legitimate[*] refcount==0 AMD GPU case and a buggy refcount==0
use-after-free scenario.  I don't want to make that sacrifice as the legitimate
!refcounted use case is a very specific use case, whereas consuming refcounted
memory is ubiquitous (outside of maybe AWS).

[*] Consuming !refcounted pages is safe only for flows that are tied into the
    mmu_notifiers.  The current proposal/plan is to add an off-by-default module
    param that lets userspace opt in to kmap() use of !refcounted memory, e.g.
    this case and PFNMAP memory.
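
To make the trade-off concrete, here is a userspace-only toy model, not kernel
code.  struct model_page and the model_*() helpers are made up for illustration
and only mimic the shape of the check in the diff above.  It shows that a
legitimate !refcounted page and a freed page look identical once all KVM has is
refcount==0, so silently skipping such pages also hides the use-after-free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount matters here. */
struct model_page {
	int refcount;
};

/*
 * Number of times the use-after-free assertion fired.  This models
 * WARN_ON_ONCE(), except that it counts every hit instead of only the first.
 */
static int model_nr_warnings;

/* Policy A: silently skip refcount==0 pages, i.e. "ignore" them. */
static bool model_ad_tracked_ignore(const struct model_page *page)
{
	return page->refcount != 0;
}

/* Policy B: treat refcount==0 as a bug, as in the diff quoted above. */
static bool model_ad_tracked_assert(const struct model_page *page)
{
	if (page->refcount == 0) {
		model_nr_warnings++;
		return false;
	}
	return true;
}

/* Runs both policies over a legitimate !refcounted page and a freed page. */
static void model_self_check(void)
{
	struct model_page gpu_page = { .refcount = 0 };   /* legitimate, never refcounted */
	struct model_page freed_page = { .refcount = 0 }; /* buggy, already freed */

	/* Policy A gives the same answer for both and leaves no trace. */
	assert(model_ad_tracked_ignore(&gpu_page) ==
	       model_ad_tracked_ignore(&freed_page));

	/* Policy B flags both: the freed page is caught, but so is the GPU page. */
	model_nr_warnings = 0;
	model_ad_tracked_assert(&gpu_page);
	model_ad_tracked_assert(&freed_page);
	assert(model_nr_warnings == 2);
}
```

Neither policy can tell the two pages apart from the refcount alone, which is
why the legitimate case needs an explicit opt-in rather than a relaxed check.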


