Re: [PATCH] drm/syncobj: Deal with signalled fences in transfer.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 7, 2021 at 12:28 PM Lionel Landwerlin
<lionel.g.landwerlin@xxxxxxxxx> wrote:
>
> On 07/12/2021 13:00, Christian König wrote:
> > Am 07.12.21 um 11:40 schrieb Bas Nieuwenhuizen:
> >> On Tue, Dec 7, 2021 at 8:21 AM Christian König
> >> <christian.koenig@xxxxxxx> wrote:
> >>> Am 07.12.21 um 08:10 schrieb Lionel Landwerlin:
> >>>> On 07/12/2021 03:32, Bas Nieuwenhuizen wrote:
> >>>>> See the comments in the code. Basically if the seqno is already
> >>>>> signalled then we get a NULL fence. If we then put the NULL fence
> >>>>> in a binary syncobj it counts as unsignalled, making that syncobj
> >>>>> pretty much useless for all expected uses.
> >>>>>
> >>>>> Not 100% sure about the transfer to a timeline syncobj but I
> >>>>> believe it is needed there too, as AFAICT the add_point function
> >>>>> assumes the fence isn't NULL.
> >>>>>
> >>>>> Fixes: ea569910cbab ("drm/syncobj: add transition iotcls between
> >>>>> binary and timeline v2")
> >>>>> Cc: stable@xxxxxxxxxxxxxxx
> >>>>> Signed-off-by: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
> >>>>> ---
> >>>>>    drivers/gpu/drm/drm_syncobj.c | 26 ++++++++++++++++++++++++++
> >>>>>    1 file changed, 26 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
> >>>>> b/drivers/gpu/drm/drm_syncobj.c
> >>>>> index fdd2ec87cdd1..eb28a40400d2 100644
> >>>>> --- a/drivers/gpu/drm/drm_syncobj.c
> >>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
> >>>>> @@ -861,6 +861,19 @@ static int
> >>>>> drm_syncobj_transfer_to_timeline(struct drm_file *file_private,
> >>>>>                         &fence);
> >>>>>        if (ret)
> >>>>>            goto err;
> >>>>> +
> >>>>> +    /* If the requested seqno is already signaled
> >>>>> drm_syncobj_find_fence may
> >>>>> +     * return a NULL fence. To make sure the recipient gets
> >>>>> signalled, use
> >>>>> +     * a new fence instead.
> >>>>> +     */
> >>>>> +    if (!fence) {
> >>>>> +        fence = dma_fence_allocate_private_stub();
> >>>>> +        if (!fence) {
> >>>>> +            ret = -ENOMEM;
> >>>>> +            goto err;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>
> >>>> Shouldn't we fix drm_syncobj_find_fence() instead?
> >>> Mhm, now that you mention it. Bas, why do you think that
> >>> dma_fence_chain_find_seqno() may return NULL when the fence is already
> >>> signaled?
> >>>
> >>> Double checking the code that should never ever happen.
> >> Well, I tested the patch with
> >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fmerge_requests%2F14097%2Fdiffs%3Fcommit_id%3Dd4c5c840f4e3839f9f5c1747a9034eb2b565f5c0&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7Cc1ab29fc100842826f5d08d9b96e102a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637744705383763833%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=sXkTJWm%2FWm2xwgLGdepVWAOlqj%2FeArnvmMvnJpQ9YEs%3D&amp;reserved=0
> >>
> >> so I'm pretty sure it happens, and this patch fixes  it, though I may
> >> have misidentified what the code should do.
> >>
> >> My reading is that the dma_fence_chain_for_each in
> >> dma_fence_chain_find_seqno will never visit a signalled fence (unless
> >> the top one is signalled), as dma_fence_chain_walk will never return a
> >> signalled fence (it only returns on NULL or !signalled).
> >
> > Ah, yes that suddenly makes more sense.
> >
> >> Happy to move this to drm_syncobj_find_fence.
> >
> > No, I think that your current patch is fine.
> >
> > That drm_syncobj_find_fence() only returns NULL when it can't find
> > anything !signaled is correct behavior I think.
>
>
> We should probably update the docs then :
>
>
>   * Returns 0 on success or a negative error value on failure. On
> success @fence
>   * contains a reference to the fence, which must be released by calling
>   * dma_fence_put().
>
>
> Looking at some of the kernel drivers, it looks like they don't all
> protect themselves against NULL pointers :
>
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/vc4/vc4_gem.c#L1195
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c#L1020

amdgpu handles it here (amdgpu_sync_fence checks for a NULL fence).
But yeah I think it is a bit treacherous, especially as this only
occurs with timeline semaphores.

>
>
> -Lionel
>
>
> >
> > Going to push your original patch if nobody has any more objections.
> >
> > But somebody might want to take care of the IGT as well.
> >
> > Regards,
> > Christian.
> >
> >>> Regards,
> >>> Christian.
> >>>
> >>>> By returning a stub fence for the timeline case if there isn't one.
> >>>>
> >>>>
> >>>> Because the same NULL fence check appears missing in amdgpu (and
> >>>> probably other drivers).
> >>>>
> >>>>
> >>>> Also we should have tests for this in IGT.
> >>>>
> >>>> AMD contributed some tests when this code was written but they never
> >>>> got reviewed :(
> >>>>
> >>>>
> >>>> -Lionel
> >>>>
> >>>>
> >>>>>        chain = kzalloc(sizeof(struct dma_fence_chain), GFP_KERNEL);
> >>>>>        if (!chain) {
> >>>>>            ret = -ENOMEM;
> >>>>> @@ -890,6 +903,19 @@ drm_syncobj_transfer_to_binary(struct drm_file
> >>>>> *file_private,
> >>>>>                         args->src_point, args->flags, &fence);
> >>>>>        if (ret)
> >>>>>            goto err;
> >>>>> +
> >>>>> +    /* If the requested seqno is already signaled
> >>>>> drm_syncobj_find_fence may
> >>>>> +     * return a NULL fence. To make sure the recipient gets
> >>>>> signalled, use
> >>>>> +     * a new fence instead.
> >>>>> +     */
> >>>>> +    if (!fence) {
> >>>>> +        fence = dma_fence_allocate_private_stub();
> >>>>> +        if (!fence) {
> >>>>> +            ret = -ENOMEM;
> >>>>> +            goto err;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>>        drm_syncobj_replace_fence(binary_syncobj, fence);
> >>>>>        dma_fence_put(fence);
> >>>>>    err:
> >>>>
> >
>




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux