On Wed, Apr 17, 2024 at 05:31:08PM +0100, Matthew Auld wrote:
> We flush the rebind worker during the vm close phase, however in places
> like preempt_fence_work_func() we seem to queue the rebind worker
> without first checking if the vm has already been closed. The concern
> here is the vm being closed with the worker flushed, but then being
> rearmed later, which looks like a potential uaf, since there is no
> actual refcounting to track the queued worker. To ensure this can't
> happen, prevent queueing the rebind worker once the vm has been closed.
> 
> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304
> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249
> Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v6.8+
> ---
>  drivers/gpu/drm/xe/xe_pt.c |  2 +-
>  drivers/gpu/drm/xe/xe_vm.h | 17 ++++++++++++++---
>  2 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_pt.c b/drivers/gpu/drm/xe/xe_pt.c
> index 5b7930f46cf3..e21461be904f 100644
> --- a/drivers/gpu/drm/xe/xe_pt.c
> +++ b/drivers/gpu/drm/xe/xe_pt.c
> @@ -1327,7 +1327,7 @@ __xe_pt_bind_vma(struct xe_tile *tile, struct xe_vma *vma, struct xe_exec_queue
>  		}
>  		if (!rebind && last_munmap_rebind &&
>  		    xe_vm_in_preempt_fence_mode(vm))
> -			xe_vm_queue_rebind_worker(vm);
> +			xe_vm_queue_rebind_worker_locked(vm);
>  	} else {
>  		kfree(rfence);
>  		kfree(ifence);
> diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> index 306cd0934a19..8420fbf19f6d 100644
> --- a/drivers/gpu/drm/xe/xe_vm.h
> +++ b/drivers/gpu/drm/xe/xe_vm.h
> @@ -211,10 +211,20 @@ int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker);
> 
>  int xe_vm_invalidate_vma(struct xe_vma *vma);
> 
> -static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
> +static inline void xe_vm_queue_rebind_worker_locked(struct xe_vm *vm)
>  {
>  	xe_assert(vm->xe, xe_vm_in_preempt_fence_mode(vm));
> -	queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work);
> +	lockdep_assert_held(&vm->lock);
> +
> +	if (!xe_vm_is_closed(vm))

This should be xe_vm_is_closed_or_banned.

Otherwise LGTM. With the above changed:
Reviewed-by: Matthew Brost <matthew.brost@xxxxxxxxx>

> +		queue_work(vm->xe->ordered_wq, &vm->preempt.rebind_work);
> +}
> +
> +static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
> +{
> +	down_read(&vm->lock);
> +	xe_vm_queue_rebind_worker_locked(vm);
> +	up_read(&vm->lock);
>  }
> 
>  /**
> @@ -225,12 +235,13 @@ static inline void xe_vm_queue_rebind_worker(struct xe_vm *vm)
>   * If the rebind functionality on a compute vm was disabled due
>   * to nothing to execute. Reactivate it and run the rebind worker.
>   * This function should be called after submitting a batch to a compute vm.
>   + *
>   */
>  static inline void xe_vm_reactivate_rebind(struct xe_vm *vm)
>  {
>  	if (xe_vm_in_preempt_fence_mode(vm) && vm->preempt.rebind_deactivated) {
>  		vm->preempt.rebind_deactivated = false;
> -		xe_vm_queue_rebind_worker(vm);
> +		xe_vm_queue_rebind_worker_locked(vm);
>  	}
>  }
> 
> -- 
> 2.44.0
> 