[PATCH 07/25] drm/amdgpu: Update kgd2kfd_shared_resources for dGPU support

felix.kuehling@xxxxxxx (Felix Kuehling) · Mon, 29 Jan 2018 15:25:35 -0500

On 2018-01-29 06:42 AM, Christian KÃ¶nig wrote:
> Am 29.01.2018 um 00:02 schrieb Felix Kuehling:
>> On 2018-01-27 04:19 AM, Christian KÃ¶nig wrote:
>>> Am 27.01.2018 um 02:09 schrieb Felix Kuehling:
>>>> Add GPUVM size and DRM render node. Also add function to query the
>>>> VMID mask to avoid hard-coding it in multiple places later.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>> ---
>>>> Â Â  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.cÂ Â Â Â Â  | 19
>>>> +++++++++++++++++--
>>>> Â Â  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.hÂ Â Â Â Â  |Â  2 ++
>>>> Â Â  drivers/gpu/drm/amd/include/kgd_kfd_interface.h |Â  6 ++++++
>>>> Â Â  3 files changed, 25 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> index c9f204d..294c467 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>> @@ -30,6 +30,8 @@
>>>> Â Â  const struct kgd2kfd_calls *kgd2kfd;
>>>> Â Â  bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>>> Â Â  +static const unsigned int compute_vmid_bitmap = 0xFF00;
>>>> +
>>>> Â Â  int amdgpu_amdkfd_init(void)
>>>> Â Â  {
>>>> Â Â Â Â Â Â  int ret;
>>>> @@ -137,9 +139,12 @@ void amdgpu_amdkfd_device_init(struct
>>>> amdgpu_device *adev)
>>>> Â Â Â Â Â Â  int last_valid_bit;
>>>> Â Â Â Â Â Â  if (adev->kfd) {
>>>> Â Â Â Â Â Â Â Â Â Â  struct kgd2kfd_shared_resources gpu_resources = {
>>>> -Â Â Â Â Â Â Â Â Â Â Â  .compute_vmid_bitmap = 0xFF00,
>>>> +Â Â Â Â Â Â Â Â Â Â Â  .compute_vmid_bitmap = compute_vmid_bitmap,
>>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â  .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
>>>> -Â Â Â Â Â Â Â Â Â Â Â  .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe
>>>> +Â Â Â Â Â Â Â Â Â Â Â  .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
>>>> +Â Â Â Â Â Â Â Â Â Â Â  .gpuvm_size = adev->vm_manager.max_pfn
>>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  << AMDGPU_GPU_PAGE_SHIFT,
>>> That most likely doesn't work as intended on Vega10. The address space
>>> is divided into an upper and a lower range, but max_pfn includes both.
>>>
>>> I suggest to use something like min(adev->vm_manager.max_pfn <<
>>> AMDGPU_GPU_PAGE_SHIFT, AMDGPU_VM_HOLE_START).
>> I think this is fine as it is. This just tells the Thunk the size of the
>> virtual address space supported by the GPU. Currently the Thunk only
>> uses 40-bits for SVM. But eventually it will be able to use the entire
>> 47 bits of user address space. Any excess address space will just go
>> unused.
>>
>> I'm also wondering how universal the split 48-bit virtual address space
>> layout is. Even for x86_64 there seems to be a 5-level page table layout
>> that supports 56 bits of user mode addresses
>> (Documentation/x86/x86_64/mm.txt). AArch64 seems to support 48-bit user
>> mode addresses (Documentation/arm64/memory.txt). I haven't found similar
>> information for PowerPC yet.
>>
>> We should avoid coding too much architecture-specific logic into this
>> driver that's supposed to support other architectures as well. I should
>> also review the aperture placement with bigger user mode address spaces
>> in mind.
>
> And that is exactly the reason why I've suggested to change that.
>
> See the split between lower and upper range is not related to the CPU
> configuration, instead it is a property of our GPU setup and hardware
> generation.

How exactly does the GPUVM hardware behave when it sees a virtual
address "in the hole". Does it generate a VM fault? Or does it simply
ignore the high bits?

If it ignored the high bits, it would work OK for both x86_64 (with a
split address space) and ARM64 (with a full 48-bit user mode virtual
address space).

On the other hand, if addresses in the hole generate a VM fault, then I
agree with you, and we should only report 47-bits of virtual address
space to user mode for SVM purposes.

Regards,
Â  Felix

>
> So we should either split it in gpuvm_size_low/gpuvm_size_high or to
> just use the lower range and limit the value as suggested above.
>
> This should avoid having any GPU generation and CPU configuration
> specific logic in the common interface.
>
> Regards,
> Christian.
>
>>
>> Regards,
>> Â Â  Felix
>