On Fri, Feb 14, 2025 at 05:05:28PM +0000, Matthew Auld wrote:
> On error, restore anything still on the pin_list back to the
> invalidation list. For the actual pin, so long as the vma is tracked on
> either list it should get picked up on the next pin; however, it looks
> possible for the vma to get nuked but still be present on this per-vm
> pin_list, leading to corruption. An alternative might then be to
> instead just remove the link when destroying the vma.
>
> Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas")
> Suggested-by: Matthew Brost <matthew.brost@xxxxxxxxx>
> Signed-off-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v6.8+
> ---
>  drivers/gpu/drm/xe/xe_vm.c | 26 +++++++++++++++++++-------
>  1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index d664f2e418b2..668b0bde7822 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -670,12 +670,12 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>  	list_for_each_entry_safe(uvma, next, &vm->userptr.invalidated,
>  				 userptr.invalidate_link) {
>  		list_del_init(&uvma->userptr.invalidate_link);
> -		list_move_tail(&uvma->userptr.repin_link,
> -			       &vm->userptr.repin_list);
> +		list_add_tail(&uvma->userptr.repin_link,
> +			      &vm->userptr.repin_list);

Why this change?

>  	}
>  	spin_unlock(&vm->userptr.invalidated_lock);
>
> -	/* Pin and move to temporary list */
> +	/* Pin and move to bind list */
>  	list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>  				 userptr.repin_link) {
>  		err = xe_vma_userptr_pin_pages(uvma);
> @@ -691,10 +691,10 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>  			err = xe_vm_invalidate_vma(&uvma->vma);
>  			xe_vm_unlock(vm);
>  			if (err)
> -				return err;
> +				break;
>  		} else {
> -			if (err < 0)
> -				return err;
> +			if (err)
> +				break;
>
>  			list_del_init(&uvma->userptr.repin_link);
>  			list_move_tail(&uvma->vma.combined_links.rebind,
> @@ -702,7 +702,19 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>  		}
>  	}
>
> -	return 0;
> +	if (err) {
> +		down_write(&vm->userptr.notifier_lock);

Can you explain why you take the notifier lock here? I don't think this
is required unless I'm missing something.

Matt

> +		spin_lock(&vm->userptr.invalidated_lock);
> +		list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
> +					 userptr.repin_link) {
> +			list_del_init(&uvma->userptr.repin_link);
> +			list_move_tail(&uvma->userptr.invalidate_link,
> +				       &vm->userptr.invalidated);
> +		}
> +		spin_unlock(&vm->userptr.invalidated_lock);
> +		up_write(&vm->userptr.notifier_lock);
> +	}
> +	return err;
>  }
>
>  /**
> -- 
> 2.48.1
>
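[A side note on the list_move_tail() vs. list_add_tail() question above,
for readers not steeped in the list API: the two helpers differ only in
whether the entry is unlinked first. From include/linux/list.h (debug
checking lives in the inner helpers):

	/* Assumes @new is not currently linked on any list. */
	static inline void list_add_tail(struct list_head *new,
					 struct list_head *head)
	{
		__list_add(new, head->prev, head);
	}

	/* Unlinks @list from wherever it is, then appends it to @head. */
	static inline void list_move_tail(struct list_head *list,
					  struct list_head *head)
	{
		__list_del_entry(list);
		list_add_tail(list, head);
	}

So the switch to list_add_tail() is only safe if repin_link is
guaranteed to be off-list at that point; if it can still be linked
somewhere, list_move_tail() is the tolerant variant.]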
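[The commit message also floats an alternative: drop the links when the
vma itself is destroyed. As a rough, hypothetical sketch of that shape,
where the helper name and the locking are assumptions on my part rather
than the actual driver API (the driver may well need the notifier lock
here too):

	/* Hypothetical: called from vma destruction so a dying vma can
	 * never linger on either per-VM userptr list.
	 */
	static void xe_vma_userptr_unlink(struct xe_vm *vm,
					  struct xe_userptr_vma *uvma)
	{
		spin_lock(&vm->userptr.invalidated_lock);
		list_del_init(&uvma->userptr.invalidate_link);
		list_del_init(&uvma->userptr.repin_link);
		spin_unlock(&vm->userptr.invalidated_lock);
	}
]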