Re: [PATCH 1/3] drm/xe/vm: prevent UAF with asid based lookup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 12, 2024 at 03:42:00PM +0100, Matthew Auld wrote:
> On 12/04/2024 15:06, Lucas De Marchi wrote:
> > On Fri, Apr 12, 2024 at 12:31:45PM +0100, Matthew Auld wrote:
> > > The asid is only erased from the xarray when the vm refcount reaches
> > > zero, however this leads to potential UAF since the xe_vm_get() only
> > 
> > I'm not sure I understand the call chain an where xe_vm_get() is coming
> > into play here.
> > 
> > 
> > > works on a vm with refcount != 0. Since the asid is allocated in the vm
> > > create ioctl, rather erase it when closing the vm, prior to dropping the
> > > potential last ref. This should also work when user closes driver fd
> > > without explicit vm destroy.
> > 
> > what seems weird is that you are moving it earlier in the call stack
> > rather than later, outside of the worker, to prevent the UAF.
> > 
> > what exactly was the UAF on?
> 
> UAF on the vm object. From the bug report it's when servicing some GPU
> fault, so inside handle_pagefault(). At this stage it just has some asid
> which is meant to map to some vm AFAICT. The lookup dance relies on calling
> xe_vm_get() after getting back the vm pointer from the xarray. Currently the
> asid is only erased from the xarray in vm_destroy_work_func() which is long
> after the refcount reaches zero and we are about to free the memory for the
> vm.
> 
> However xe_vm_get() is only meant to be called if you are already holding a
> ref, so if the vm refcount is already zero it just throws an error and
> continues on, and the caller has no idea. If that happens then as soon as we
> drop the usm lock the memory can be freed and it's game over. This looks to
> be what happens with the vm refcount reaching zero, and the
> handle_pagefault() still being able to reach the soon-to-be-freed vm via the
> xarray. With this patch we now erase from the xarray before we drop what is
> potentially the final ref. That way if you can reach the vm via the xarray
> you should always be able get a valid ref.
> 

This explaination makes sense to me. Thanks for the fix.

Reviewed-by: Matthew.brost@xxxxxxxxx>

> > 
> > Lucas De Marchi
> > 
> > > 
> > > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > > Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1594
> > > Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> > > Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > Cc: <stable@xxxxxxxxxxxxxxx> # v6.8+
> > > ---
> > > drivers/gpu/drm/xe/xe_vm.c | 21 +++++++++++----------
> > > 1 file changed, 11 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index a196dbe65252..c5c26b3d1b76 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -1581,6 +1581,16 @@ void xe_vm_close_and_put(struct xe_vm *vm)
> > >         xe->usm.num_vm_in_fault_mode--;
> > >     else if (!(vm->flags & XE_VM_FLAG_MIGRATION))
> > >         xe->usm.num_vm_in_non_fault_mode--;
> > > +
> > > +    if (vm->usm.asid) {
> > > +        void *lookup;
> > > +
> > > +        xe_assert(xe, xe->info.has_asid);
> > > +        xe_assert(xe, !(vm->flags & XE_VM_FLAG_MIGRATION));
> > > +
> > > +        lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);
> > > +        xe_assert(xe, lookup == vm);
> > > +    }
> > >     mutex_unlock(&xe->usm.lock);
> > > 
> > >     for_each_tile(tile, xe, id)
> > > @@ -1596,24 +1606,15 @@ static void vm_destroy_work_func(struct
> > > work_struct *w)
> > >     struct xe_device *xe = vm->xe;
> > >     struct xe_tile *tile;
> > >     u8 id;
> > > -    void *lookup;
> > > 
> > >     /* xe_vm_close_and_put was not called? */
> > >     xe_assert(xe, !vm->size);
> > > 
> > >     mutex_destroy(&vm->snap_mutex);
> > > 
> > > -    if (!(vm->flags & XE_VM_FLAG_MIGRATION)) {
> > > +    if (!(vm->flags & XE_VM_FLAG_MIGRATION))
> > >         xe_device_mem_access_put(xe);
> > > 
> > > -        if (xe->info.has_asid && vm->usm.asid) {
> > > -            mutex_lock(&xe->usm.lock);
> > > -            lookup = xa_erase(&xe->usm.asid_to_vm, vm->usm.asid);
> > > -            xe_assert(xe, lookup == vm);
> > > -            mutex_unlock(&xe->usm.lock);
> > > -        }
> > > -    }
> > > -
> > >     for_each_tile(tile, xe, id)
> > >         XE_WARN_ON(vm->pt_root[id]);
> > > 
> > > -- 
> > > 2.44.0
> > > 




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux