Re: [PATCH 06/13] drm/amdgpu: use the new drm_exec object for CS v2

On 6/20/23 17:16, Tatsuyuki Ishi wrote:
On 6/20/23 17:12, Christian König wrote:
On 20.06.23 at 06:07, Tatsuyuki Ishi wrote:
@@ -928,18 +874,56 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
          e->user_invalidated = userpage_invalidated;
      }
-    r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-                   &duplicates);
-    if (unlikely(r != 0)) {
-        if (r != -ERESTARTSYS)
-            DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
-        goto out_free_user_pages;
+    drm_exec_while_not_all_locked(&p->exec) {
+        r = amdgpu_vm_lock_pd(&fpriv->vm, &p->exec);
+        drm_exec_continue_on_contention(&p->exec);

Duplicate handling is needed for pretty much every call of amdgpu_vm_lock_pd, as bo->tbo.base.resv == vm->root.bo->tbo.base.resv for AMDGPU_GEM_CREATE_VM_ALWAYS_VALID.

Well no. AMDGPU_GEM_CREATE_VM_ALWAYS_VALID means that BOs should *not* be part of the relocation list. So when those cause an EALREADY here then userspace has a bug.

Sounds fair, lemme check how RADV is handling this again.

I checked again: the relocation list was actually fine, but other places were not. For example, amdgpu_gem_object_close
locks both bo->tbo.base.resv and vm->root.bo->tbo.base.resv (the PD) on its own.

This was the easily debuggable case since it caused an error log, but some other BO operations on ALWAYS_VALID BOs
are presumably broken for the same reason.
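
To illustrate the failure mode, here is a minimal userspace sketch, not kernel code. All toy_* names and the ignore_duplicates parameter are hypothetical and made up for this example; the only thing it borrows from the kernel is the ww_mutex rule that taking a lock the same acquire context already holds returns -EALREADY. When the BO shares its reservation object with the PD, as with ALWAYS_VALID, the second lock in a "lock PD, then lock BO" path hits that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a reservation object; NOT the kernel's struct dma_resv. */
struct toy_resv {
    const void *owner; /* acquire context holding the lock, or NULL */
};

/* Toy stand-in for a BO: it only records which reservation object it uses. */
struct toy_bo {
    struct toy_resv *resv;
};

/*
 * Lock resv on behalf of acquire context ctx.
 * Relocking from the same context yields -EALREADY, as with ww_mutex.
 */
static int toy_resv_lock(struct toy_resv *resv, const void *ctx)
{
    assert(resv != NULL && ctx != NULL);
    if (resv->owner == ctx)
        return -EALREADY;
    if (resv->owner != NULL)
        return -EDEADLK; /* held by someone else; real code would wait or back off */
    resv->owner = ctx;
    return 0;
}

/* Drop the lock if ctx is the current holder. */
static void toy_resv_unlock(struct toy_resv *resv, const void *ctx)
{
    if (resv->owner == ctx)
        resv->owner = NULL;
}

/*
 * Lock the PD and then the BO, the way a close path might.
 * With ignore_duplicates set, a BO sharing the PD's reservation object
 * is treated as already locked instead of as an error.
 */
static int toy_lock_pd_and_bo(const void *ctx, struct toy_bo *pd,
                              struct toy_bo *bo, bool ignore_duplicates)
{
    int r = toy_resv_lock(pd->resv, ctx);

    if (r)
        return r;

    r = toy_resv_lock(bo->resv, ctx);
    if (r == -EALREADY && ignore_duplicates)
        return 0; /* same reservation object as the PD, nothing more to lock */
    if (r)
        toy_resv_unlock(pd->resv, ctx); /* undo the PD lock on failure */
    return r;
}
```

With pd.resv == bo.resv, the call fails with -EALREADY unless duplicates are tolerated; with two distinct reservation objects both locks are taken either way. Whether the real fix belongs in the caller or in the locking helper is a separate question that this sketch does not answer.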

Tatsuyuki


