Am 07.08.2018 um 20:14 schrieb Chris Wilson: > Quoting Christian König (2018-08-07 18:57:16) >> Am 07.08.2018 um 18:08 schrieb Chris Wilson: >>> amdgpu only uses shared-fences internally, but dmabuf importers rely on >>> implicit write hazard tracking via the reservation_object.fence_excl. >>> For example, the importer use the write hazard for timing a page flip to >>> only occur after the exporter has finished flushing its write into the >>> surface. As such, on exporting a dmabuf, we must either flush all >>> outstanding fences (for we do not know which are writes and should have >>> been exclusive) or alternatively create a new exclusive fence that is >>> the composite of all the existing shared fences, and so will only be >>> signaled when all earlier fences are signaled (ensuring that we can not >>> be signaled before the completion of any earlier write). >>> >>> v2: reservation_object is already locked by amdgpu_bo_reserve() >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341 >>> Testcase: igt/amd_prime/amd-to-i915 >>> References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)") >>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk> >>> Cc: Alex Deucher <alexander.deucher at amd.com> >>> Cc: "Christian König" <christian.koenig at amd.com> >>> --- >>> >>> This time, hopefully proofread and references complete. >>> -Chris >>> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 68 ++++++++++++++++++++--- >>> 1 file changed, 60 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c >>> index 1c5d97f4b4dd..dff2b01a3d89 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c >>> @@ -37,6 +37,7 @@ >>> #include "amdgpu_display.h" >>> #include <drm/amdgpu_drm.h> >>> #include <linux/dma-buf.h> >>> +#include <linux/dma-fence-array.h> >>> >>> static const struct dma_buf_ops amdgpu_dmabuf_ops; >>> >>> @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, >>> return ERR_PTR(ret); >>> } >>> >>> +static int >>> +__reservation_object_make_exclusive(struct reservation_object *obj) >>> +{ >>> + struct reservation_object_list *fobj; >>> + struct dma_fence_array *array; >>> + struct dma_fence **fences; >>> + unsigned int count, i; >>> + >>> + fobj = reservation_object_get_list(obj); >>> + if (!fobj) >>> + return 0; >>> + >>> + count = !!rcu_access_pointer(obj->fence_excl); >>> + count += fobj->shared_count; >>> + >>> + fences = kmalloc_array(count, sizeof(*fences), GFP_KERNEL); >>> + if (!fences) >>> + return -ENOMEM; >>> + >>> + for (i = 0; i < fobj->shared_count; i++) { >>> + struct dma_fence *f = >>> + rcu_dereference_protected(fobj->shared[i], >>> + reservation_object_held(obj)); >>> + >>> + fences[i] = dma_fence_get(f); >>> + } >>> + >>> + if (rcu_access_pointer(obj->fence_excl)) { >>> + struct dma_fence *f = >>> + rcu_dereference_protected(obj->fence_excl, >>> + reservation_object_held(obj)); >>> + >>> + fences[i] = dma_fence_get(f); >>> + } >>> + >>> + array = dma_fence_array_create(count, fences, >>> + dma_fence_context_alloc(1), 0, >>> + false); >>> + if (!array) >>> + goto err_fences_put; >>> + >>> + reservation_object_add_excl_fence(obj, &array->base); >>> + return 0; >>> + >>> +err_fences_put: >>> + for (i = 0; i < count; i++) >>> + dma_fence_put(fences[i]); >>> + kfree(fences); >>> + return -ENOMEM; >>> +} >>> + >> This can be simplified a lot. See amdgpu_pasid_free_delayed() for an >> example: > { > if (!reservation_object_get_list(obj)) > return 0; > > r = reservation_object_get_fences_rcu(obj, NULL, &count, &fences); > if (r) > return r; > > array = dma_fence_array_create(count, fences, > dma_fence_context_alloc(1), 0, > false); > if (!array) > goto err_fences_put; > > reservation_object_add_excl_fence(obj, &array->base); > return 0; > > err: > ... > } > > My starting point was going to be use get_fences_rcu, but get_fences_rcu > can hardly be called simple for where the lock is required to be held > start to finish ;) What are you talking about? get_fences_rcu doesn't require any locking at all. You only need to the locking to make sure that between creating the fence array and calling reservation_object_add_excl_fence() no other fence is added. Christian. > -Chris > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx