On 2018-08-08 14:48, Christian König wrote:
> On 08.08.2018 at 06:23, zhoucm1 wrote:
>>
>>
>> On 2018-08-08 12:08, Junwei Zhang wrote:
>>> Userspace needs to know if the user memory is from BO or malloc.
>>>
>>> v2: update mutex range and rebase
>>>
>>> Signed-off-by: Junwei Zhang <Jerry.Zhang at amd.com>
>>> ---
>>>  amdgpu/amdgpu.h    | 23 +++++++++++++++++++++++
>>>  amdgpu/amdgpu_bo.c | 34 ++++++++++++++++++++++++++++++++++
>>>  2 files changed, 57 insertions(+)
>>>
>>> diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
>>> index be83b45..a8c353c 100644
>>> --- a/amdgpu/amdgpu.h
>>> +++ b/amdgpu/amdgpu.h
>>> @@ -678,6 +678,29 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
>>>                      amdgpu_bo_handle *buf_handle);
>>>
>>>  /**
>>> + * Validate if the user memory comes from BO
>>> + *
>>> + * \param dev - [in] Device handle. See #amdgpu_device_initialize()
>>> + * \param cpu - [in] CPU address of user allocated memory which we
>>> + * want to map to GPU address space (make GPU accessible)
>>> + * (This address must be correctly aligned).
>>> + * \param size - [in] Size of allocation (must be correctly aligned)
>>> + * \param buf_handle - [out] Buffer handle for the userptr memory
>>> + * if the user memory is not from BO, the buf_handle will be NULL.
>>> + * \param offset_in_bo - [out] offset in this BO for this user memory
>>> + *
>>> + *
>>> + * \return  0 on success\n
>>> + *         <0 - Negative POSIX Error code
>>> + *
>>> +*/
>>> +int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,
>>> +                 void *cpu,
>>> +                 uint64_t size,
>>> +                 amdgpu_bo_handle *buf_handle,
>>> +                 uint64_t *offset_in_bo);
>>> +
>>> +/**
>>>   * Free previosuly allocated memory
>>>   *
>>>   * \param  dev          - \c [in] Device handle. See
>>> #amdgpu_device_initialize()
>>> diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
>>> index b24e698..a7f0662 100644
>>> --- a/amdgpu/amdgpu_bo.c
>>> +++ b/amdgpu/amdgpu_bo.c
>>> @@ -529,6 +529,40 @@ int amdgpu_bo_wait_for_idle(amdgpu_bo_handle bo,
>>>      }
>>>  }
>>>
>>> +int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,
>>> +                 void *cpu,
>>> +                 uint64_t size,
>>> +                 amdgpu_bo_handle *buf_handle,
>>> +                 uint64_t *offset_in_bo)
>>> +{
>>> +   int i;
>>> +   struct amdgpu_bo *bo;
>>> +
>>> +   if (cpu == NULL || size == 0)
>>> +       return -EINVAL;
>>> +
>>> +   pthread_mutex_lock(&dev->bo_table_mutex);
>>> +   for (i = 0; i < dev->bo_handles.max_key; i++) {
>>
>> Hi Jerry,
>>
>> As Christian caught before, iterating over all BOs of the device will
>> introduce a lot of CPU overhead, so this isn't a good direction.
>> Since the CPU virtual address is per-process, you should go to the
>> kernel and find them in the VM tree, which obviously takes less time.
>
> Yeah, but it is also much more overhead to maintain.
>
> Since this is only to fix the behavior of a single buggy application,
> at least I'm fine with keeping the workaround as simple as this.

I like the 'workaround' expression; if Jerry adds a 'workaround' comment
here, I'm OK as well.

Regards,
David Zhou

> If we find a wider use we can still start to use the kernel
> implementation again.
>
> Regards,
> Christian.

>>
>> Regards,
>> David Zhou
>>
>>> +       bo = handle_table_lookup(&dev->bo_handles, i);
>>> +       if (!bo || !bo->cpu_ptr || size > bo->alloc_size)
>>> +           continue;
>>> +       if (cpu >= bo->cpu_ptr && cpu < (bo->cpu_ptr + bo->alloc_size))
>>> +           break;
>>> +   }
>>> +
>>> +   if (i < dev->bo_handles.max_key) {
>>> +       atomic_inc(&bo->refcount);
>>> +       *buf_handle = bo;
>>> +       *offset_in_bo = cpu - bo->cpu_ptr;
>>> +   } else {
>>> +       *buf_handle = NULL;
>>> +       *offset_in_bo = 0;
>>> +   }
>>> +   pthread_mutex_unlock(&dev->bo_table_mutex);
>>> +
>>> +   return 0;
>>> +}
>>> +
>>>  int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
>>>                      void *cpu,
>>>                      uint64_t size,