On 18 July 2017 at 03:02, Christian König <deathsimple at vodafone.de> wrote:
> On 17.07.2017 at 05:36, Dave Airlie wrote:
>>>
>>> I can take a look at it, I just won't have time until next week
>>> most likely.
>>
>> I've taken a look, and it's seemingly more complicated than I'd
>> want to land in Mesa before 17.2 ships, so I'd really prefer to
>> just push the new libdrm_amdgpu API from this patch. If I have to
>> port all the current radv code to the new API, I'll most definitely
>> get something wrong.
>>
>> Adding the new API so far looks like
>> https://cgit.freedesktop.org/~airlied/drm/log/?h=drm-amdgpu-cs-submit-raw
>>
>> with
>> https://cgit.freedesktop.org/~airlied/drm/commit/?h=drm-amdgpu-cs-submit-raw&id=e7f85d0ca617fa41e72624780c9035df132e23c4
>> being the API, and whether it should take a uint32_t context id or
>> a context handle is left as an open question in the last patch in
>> the series.
>
> I would stick with the context handle; as far as I can see there
> isn't any value in using the uint32_t for this.
>
> We just want to be able to send arbitrary chunks down into the
> kernel without libdrm_amdgpu involvement and/or the associated
> overhead of the extra loop and the semaphore handling.
>
> So your "amdgpu/cs: add new raw cs submission interface just taking
> chunks" patch looks fine to me as far as I can tell.
>
> As far as I can see, the "amdgpu: refactor semaphore handling" patch
> is actually incorrect. We must hold the mutex while sending the CS
> down to the kernel, otherwise "context->last_seq" won't be accurate.
>
>> However, hooking this into radv or radeonsi will take rewriting a
>> lot of code that is probably a bit more fragile than I'd like for
>> this sort of surgery at this point.
>
> Again, I can move the existing Mesa stuff over if you like.
>
>> I actually suspect that if we do want to proceed with this type of
>> interface, we might be better off doing it all in common Mesa code,
>> maybe bypassing libdrm_amdgpu altogether, which I suppose the API
>> I've written here is mostly already doing.
>
> I want to stick with the other interfaces for now. No need to make
> it more complicated than it already is.
>
> Only the CS stuff is the most performance-critical thing we have
> right now.

As I suspected, this plan is full of traps.

So with the raw CS API I posted (using amdgpu_bo_list_handle instead),
I ran into two places where the abstraction cuts me:

  CC       winsys/amdgpu/radv_amdgpu_cs.lo
winsys/amdgpu/radv_amdgpu_cs.c: In function 'radv_amdgpu_cs_submit':
winsys/amdgpu/radv_amdgpu_cs.c:1173:63: error: dereferencing pointer to incomplete type 'struct amdgpu_bo'
    chunk_data[i].fence_data.handle = request->fence_info.handle->handle;
                                                                 ^~
winsys/amdgpu/radv_amdgpu_cs.c:1193:31: error: dereferencing pointer to incomplete type 'struct amdgpu_context'
    dep->ctx_id = info->context->id;

In order to build the user fence chunk, I need the actual kernel BO
handle, not the amdgpu-wrapped one, and we don't have an accessor
method for that.

In order to build the dependency chunks, I need a context id.

Now I suppose I can add chunk-creation helpers to libdrm, but it does
seem like it breaks the future-proof interface if we can't access the
details of a bunch of the objects we want to pass through to the
kernel API.
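
To illustrate what I mean by chunk-creation helpers, here is a rough
sketch of how they could look. They would have to live inside
libdrm_amdgpu, where the private struct amdgpu_bo / struct
amdgpu_context members are visible; the names, signatures and error
handling below are hypothetical, just a sketch, not existing
libdrm_amdgpu API:

/* amdgpu_cs.c -- sketch of hypothetical chunk-creation helpers */
#include <errno.h>
#include <stdint.h>
#include "amdgpu.h"
#include "amdgpu_drm.h"
#include "amdgpu_internal.h"   /* private struct amdgpu_bo / amdgpu_context */

/* Translate the wrapped fence info into the raw user fence chunk data,
 * so the caller never needs the kernel BO handle directly.
 * (Assumes fence_info->offset counts 64-bit fence slots.) */
int amdgpu_cs_chunk_fence_info_to_data(struct amdgpu_cs_fence_info *fence_info,
                                       struct drm_amdgpu_cs_chunk_data *data)
{
        if (!fence_info || !fence_info->handle || !data)
                return -EINVAL;

        data->fence_data.handle = fence_info->handle->handle;
        data->fence_data.offset = fence_info->offset * sizeof(uint64_t);
        return 0;
}

/* Translate a wrapped fence (context handle + ip/ring + sequence number)
 * into a raw dependency chunk entry, hiding the private context id. */
int amdgpu_cs_chunk_fence_to_dep(struct amdgpu_cs_fence *fence,
                                 struct drm_amdgpu_cs_chunk_dep *dep)
{
        if (!fence || !fence->context || !dep)
                return -EINVAL;

        dep->ip_type = fence->ip_type;
        dep->ip_instance = fence->ip_instance;
        dep->ring = fence->ring;
        dep->ctx_id = fence->context->id;
        dep->handle = fence->fence;
        return 0;
}

That would keep struct amdgpu_bo and struct amdgpu_context opaque to
radv, but every new chunk type then needs another helper, which is
what worries me about the future-proofing.

Dave.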