On 01.08.2018 at 19:59, Marek Olšák wrote:
> On Wed, Aug 1, 2018 at 1:52 PM, Christian König
> <christian.koenig at amd.com> wrote:
>> On 01.08.2018 at 19:39, Marek Olšák wrote:
>>> On Wed, Aug 1, 2018 at 2:32 AM, Christian König
>>> <christian.koenig at amd.com> wrote:
>>>> On 01.08.2018 at 00:07, Marek Olšák wrote:
>>>>> Can this be implemented as a wrapper on top of libdrm? So that the
>>>>> tree (or hash table) isn't created for UMDs that don't need it.
>>>>
>>>> No, the problem is that an application gets a CPU pointer from one
>>>> API and tries to import that pointer into another one.
>>>>
>>>> In other words, we need to implement this independently of the UMD
>>>> that mapped the BO.
>>> Yeah, it could be an optional feature of libdrm, and other components
>>> should be able to disable it to remove the overhead.
>>
>> The overhead is negligible; the real problem is the memory footprint.
>>
>> A brief look at the hash implementation in libdrm showed that it is
>> actually quite inefficient.
>>
>> I think we have the choice of implementing an r/b tree to map the CPU
>> pointer addresses or a quadratic tree to map the handles.
>>
>> The latter is easy to do and would allow us to get rid of the hash
>> table as well.
> We can also use the hash table from mesa/src/util.
>
> I don't think the overhead would be negligible. It would be a log(n)
> insertion in bo_map and a log(n) deletion in bo_unmap. If you did
> bo_map+bo_unmap 10000 times, would it be negligible?

Compared to what the kernel needs to do to update the page tables, that
is less than 1% of the total work.

The real question is whether it wouldn't be simpler to use a tree for
the handles. Since the handles are dense, you can just use an unbalanced
tree, which is really easy to implement (a rough sketch is appended
below the thread).

For a tree of the CPU mappings we would need an r/b interval tree, which
is hard to implement and rather overkill (a sketch of the lookup such a
tree would have to answer is appended below as well).

Do you have any numbers on how many BOs actually get a CPU mapping in a
real-world application?

Christian.

>
> Marek
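
[Editor's sketch, not part of the original thread.] The thread does not
spell out what the "quadratic tree" over handles would look like, so the
following assumes a simple fixed-fanout, two-level radix tree; the names
handle_tree, handle_tree_set and handle_tree_get are invented for this
sketch. Because GEM handles are allocated densely starting near zero,
most BOs land in the first leaf, and both insert and lookup are O(1)
with no rebalancing, which is why a tree keyed by handle can be so much
simpler than one keyed by CPU address.

    #include <stdlib.h>
    #include <stdint.h>

    #define LEAF_BITS  10                    /* 1024 entries per leaf */
    #define LEAF_SIZE  (1u << LEAF_BITS)
    #define LEAF_MASK  (LEAF_SIZE - 1)
    #define TOP_SIZE   LEAF_SIZE             /* covers 2^20 handles */

    struct handle_tree {
        void **leaves[TOP_SIZE];             /* leaves allocated on demand */
    };

    static int handle_tree_set(struct handle_tree *t, uint32_t handle,
                               void *bo)
    {
        uint32_t top = handle >> LEAF_BITS;

        if (top >= TOP_SIZE)
            return -1;                       /* handle out of range */

        if (!t->leaves[top]) {
            t->leaves[top] = calloc(LEAF_SIZE, sizeof(void *));
            if (!t->leaves[top])
                return -1;
        }
        t->leaves[top][handle & LEAF_MASK] = bo;
        return 0;
    }

    static void *handle_tree_get(struct handle_tree *t, uint32_t handle)
    {
        uint32_t top = handle >> LEAF_BITS;

        if (top >= TOP_SIZE || !t->leaves[top])
            return NULL;
        return t->leaves[top][handle & LEAF_MASK];
    }

Dense handles also keep the memory footprint proportional to the number
of live BOs (rounded up to whole leaves), which is the concern raised
about the hash table above.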
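
[Editor's sketch, not part of the original thread.] For contrast, this
is the query an interval tree over CPU mappings would have to answer:
given a pointer that may land anywhere inside a mapping, find the BO
backing it. A sorted, non-overlapping array searched with binary search
stands in for the r/b interval tree here; struct cpu_mapping and
find_bo_by_pointer are invented names.

    #include <stddef.h>
    #include <stdint.h>

    struct cpu_mapping {
        uintptr_t start;   /* first byte of the CPU mapping */
        uintptr_t end;     /* one past the last byte */
        void *bo;          /* BO backing this address range */
    };

    /* mappings[] is sorted by start; the ranges do not overlap. */
    static void *find_bo_by_pointer(const struct cpu_mapping *mappings,
                                    size_t count, const void *ptr)
    {
        uintptr_t addr = (uintptr_t)ptr;
        size_t lo = 0, hi = count;

        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;

            if (addr < mappings[mid].start)
                hi = mid;                     /* search left half */
            else if (addr >= mappings[mid].end)
                lo = mid + 1;                 /* search right half */
            else
                return mappings[mid].bo;      /* start <= addr < end */
        }
        return NULL;                          /* not in any mapping */
    }

Unlike the handle tree, this structure must be updated on every bo_map
and bo_unmap and kept balanced (or re-sorted) under insertion and
deletion, which is the log(n) overhead and implementation effort
discussed in the thread.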