[PATCH v4] drm/amdgpu: Transfer fences to dmabuf importer

ckoenig.leichtzumerken@xxxxxxxxx (Christian König) · Tue, 7 Aug 2018 20:18:55 +0200

Am 07.08.2018 um 20:14 schrieb Chris Wilson:
> Quoting Christian KÃ¶nig (2018-08-07 18:57:16)
>> Am 07.08.2018 um 18:08 schrieb Chris Wilson:
>>> amdgpu only uses shared-fences internally, but dmabuf importers rely on
>>> implicit write hazard tracking via the reservation_object.fence_excl.
>>> For example, the importer use the write hazard for timing a page flip to
>>> only occur after the exporter has finished flushing its write into the
>>> surface. As such, on exporting a dmabuf, we must either flush all
>>> outstanding fences (for we do not know which are writes and should have
>>> been exclusive) or alternatively create a new exclusive fence that is
>>> the composite of all the existing shared fences, and so will only be
>>> signaled when all earlier fences are signaled (ensuring that we can not
>>> be signaled before the completion of any earlier write).
>>>
>>> v2: reservation_object is already locked by amdgpu_bo_reserve()
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107341
>>> Testcase: igt/amd_prime/amd-to-i915
>>> References: 8e94a46c1770 ("drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)")
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>> Cc: "Christian KÃ¶nig" <christian.koenig at amd.com>
>>> ---
>>>
>>> This time, hopefully proofread and references complete.
>>> -Chris
>>>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 68 ++++++++++++++++++++---
>>>    1 file changed, 60 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>>> index 1c5d97f4b4dd..dff2b01a3d89 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
>>> @@ -37,6 +37,7 @@
>>>    #include "amdgpu_display.h"
>>>    #include <drm/amdgpu_drm.h>
>>>    #include <linux/dma-buf.h>
>>> +#include <linux/dma-fence-array.h>
>>>    
>>>    static const struct dma_buf_ops amdgpu_dmabuf_ops;
>>>    
>>> @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
>>>        return ERR_PTR(ret);
>>>    }
>>>    
>>> +static int
>>> +__reservation_object_make_exclusive(struct reservation_object *obj)
>>> +{
>>> +     struct reservation_object_list *fobj;
>>> +     struct dma_fence_array *array;
>>> +     struct dma_fence **fences;
>>> +     unsigned int count, i;
>>> +
>>> +     fobj = reservation_object_get_list(obj);
>>> +     if (!fobj)
>>> +             return 0;
>>> +
>>> +     count = !!rcu_access_pointer(obj->fence_excl);
>>> +     count += fobj->shared_count;
>>> +
>>> +     fences = kmalloc_array(count, sizeof(*fences), GFP_KERNEL);
>>> +     if (!fences)
>>> +             return -ENOMEM;
>>> +
>>> +     for (i = 0; i < fobj->shared_count; i++) {
>>> +             struct dma_fence *f =
>>> +                     rcu_dereference_protected(fobj->shared[i],
>>> +                                               reservation_object_held(obj));
>>> +
>>> +             fences[i] = dma_fence_get(f);
>>> +     }
>>> +
>>> +     if (rcu_access_pointer(obj->fence_excl)) {
>>> +             struct dma_fence *f =
>>> +                     rcu_dereference_protected(obj->fence_excl,
>>> +                                               reservation_object_held(obj));
>>> +
>>> +             fences[i] = dma_fence_get(f);
>>> +     }
>>> +
>>> +     array = dma_fence_array_create(count, fences,
>>> +                                    dma_fence_context_alloc(1), 0,
>>> +                                    false);
>>> +     if (!array)
>>> +             goto err_fences_put;
>>> +
>>> +     reservation_object_add_excl_fence(obj, &array->base);
>>> +     return 0;
>>> +
>>> +err_fences_put:
>>> +     for (i = 0; i < count; i++)
>>> +             dma_fence_put(fences[i]);
>>> +     kfree(fences);
>>> +     return -ENOMEM;
>>> +}
>>> +
>> This can be simplified a lot. See amdgpu_pasid_free_delayed() for an
>> example:
> {
> 	if (!reservation_object_get_list(obj))
> 		return 0;
>
> 	r = reservation_object_get_fences_rcu(obj, NULL, &count, &fences);
> 	if (r)
> 		return r;
>
> 	array = dma_fence_array_create(count, fences,
> 				       dma_fence_context_alloc(1), 0,
> 				       false);
> 	if (!array)
> 		goto err_fences_put;
>
> 	reservation_object_add_excl_fence(obj, &array->base);
> 	return 0;
>
> err:
> 	...
> }
>
> My starting point was going to be use get_fences_rcu, but get_fences_rcu
> can hardly be called simple for where the lock is required to be held
> start to finish ;)

What are you talking about? get_fences_rcu doesn't require any locking 
at all.

You only need to the locking to make sure that between creating the 
fence array and calling reservation_object_add_excl_fence() no other 
fence is added.

Christian.

> -Chris
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx