On 2018-01-29 21:25, Felix Kuehling wrote:
> On 2018-01-29 06:42 AM, Christian König wrote:
>> On 2018-01-29 00:02, Felix Kuehling wrote:
>>> On 2018-01-27 04:19 AM, Christian König wrote:
>>>> On 2018-01-27 02:09, Felix Kuehling wrote:
>>>>> Add GPUVM size and DRM render node. Also add a function to query
>>>>> the VMID mask to avoid hard-coding it in multiple places later.
>>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>> ---
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c      | 19 +++++++++++++++++--
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h      |  2 ++
>>>>>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h |  6 ++++++
>>>>>  3 files changed, 25 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> index c9f204d..294c467 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> @@ -30,6 +30,8 @@
>>>>>  const struct kgd2kfd_calls *kgd2kfd;
>>>>>  bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>>>>
>>>>> +static const unsigned int compute_vmid_bitmap = 0xFF00;
>>>>> +
>>>>>  int amdgpu_amdkfd_init(void)
>>>>>  {
>>>>>      int ret;
>>>>> @@ -137,9 +139,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>>>      int last_valid_bit;
>>>>>      if (adev->kfd) {
>>>>>          struct kgd2kfd_shared_resources gpu_resources = {
>>>>> -            .compute_vmid_bitmap = 0xFF00,
>>>>> +            .compute_vmid_bitmap = compute_vmid_bitmap,
>>>>>              .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
>>>>> -            .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe
>>>>> +            .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
>>>>> +            .gpuvm_size = adev->vm_manager.max_pfn
>>>>> +                        << AMDGPU_GPU_PAGE_SHIFT,
>>>> That most likely doesn't work as intended on Vega10. The address
>>>> space is divided into an upper and a lower range, but max_pfn
>>>> includes both.
>>>>
>>>> I suggest using something like min(adev->vm_manager.max_pfn <<
>>>> AMDGPU_GPU_PAGE_SHIFT, AMDGPU_VM_HOLE_START).
>>> I think this is fine as it is. This just tells the Thunk the size of
>>> the virtual address space supported by the GPU. Currently the Thunk
>>> only uses 40 bits for SVM, but eventually it will be able to use the
>>> entire 47 bits of user address space. Any excess address space will
>>> just go unused.
>>>
>>> I'm also wondering how universal the split 48-bit virtual address
>>> space layout is. Even for x86_64 there seems to be a 5-level page
>>> table layout that supports 56 bits of user mode addresses
>>> (Documentation/x86/x86_64/mm.txt). AArch64 seems to support 48-bit
>>> user mode addresses (Documentation/arm64/memory.txt). I haven't
>>> found similar information for PowerPC yet.
>>>
>>> We should avoid coding too much architecture-specific logic into
>>> this driver, which is supposed to support other architectures as
>>> well. I should also review the aperture placement with bigger user
>>> mode address spaces in mind.
>> And that is exactly the reason why I suggested changing this.
>>
>> The split between the lower and upper range is not related to the CPU
>> configuration; instead, it is a property of our GPU setup and
>> hardware generation.
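Spelled out in the context of the patch, the clamping I suggested above
would look roughly like this (an untested sketch, assuming
AMDGPU_VM_HOLE_START marks the start of the hole between the two
ranges):

    struct kgd2kfd_shared_resources gpu_resources = {
        .compute_vmid_bitmap = compute_vmid_bitmap,
        .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
        .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
        /* Report only the lower range: max_pfn covers both the
         * lower and the upper range, but everything in between
         * is a hole on Vega10.
         */
        .gpuvm_size = min(adev->vm_manager.max_pfn
                          << AMDGPU_GPU_PAGE_SHIFT,
                          AMDGPU_VM_HOLE_START),
    };

That keeps the interface a single size value while never advertising
addresses the hardware cannot map.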
> How exactly does the GPUVM hardware behave when it sees a virtual
> address "in the hole"? Does it generate a VM fault? Or does it simply
> ignore the high bits?

At least on the Vega10 system I've tested, you get a range fault when
you try to access the hole.

> If it ignored the high bits, it would work OK for both x86_64 (with a
> split address space) and ARM64 (with a full 48-bit user mode virtual
> address space).
>
> On the other hand, if addresses in the hole generate a VM fault, then
> I agree with you, and we should only report 47 bits of virtual
> address space to user mode for SVM purposes.

Yes, according to my testing, that is exactly the case here. The
hardware documentation is a bit ambiguous, so I also initially thought
that the high bits are just ignored, but that doesn't seem to be the
case.

Regards,
Christian.

>
> Regards,
>   Felix
>
>> So we should either split it into gpuvm_size_low/gpuvm_size_high, or
>> just use the lower range and limit the value as suggested above.
>>
>> This should avoid having any GPU-generation- or CPU-configuration-
>> specific logic in the common interface.
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>>   Felix
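P.S.: To make the faulting behaviour concrete: with the sign-extended
48-bit layout discussed here, the hole is everything between the top of
the lower range and the bottom of the upper range. A check for it would
look roughly like this (a sketch only; the two constants are assumptions
derived from a 47-bit lower range, not values copied from the amdgpu
headers, and the helper name is made up for illustration):

    #include <linux/types.h>

    /* Assumed layout: 48 bits, sign-extended. The lower range ends at
     * 2^47, the upper range starts at 0xffff800000000000. Anything in
     * between takes a range fault on Vega10.
     */
    #define VM_HOLE_START 0x0000800000000000ULL
    #define VM_HOLE_END   0xffff800000000000ULL

    /* Return true if addr falls in the unaddressable hole between the
     * lower and upper range.
     */
    static bool addr_in_vm_hole(u64 addr)
    {
        return addr >= VM_HOLE_START && addr < VM_HOLE_END;
    }

With a check like this, user mode could validate SVM addresses up front
instead of finding out via a range fault.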