On 2018-01-29 21:25, Felix Kuehling wrote:
> On 2018-01-29 06:42 AM, Christian König wrote:
>> On 2018-01-29 00:02, Felix Kuehling wrote:
>>> On 2018-01-27 04:19 AM, Christian König wrote:
>>>> On 2018-01-27 02:09, Felix Kuehling wrote:
>>>>> Add GPUVM size and DRM render node. Also add a function to query
>>>>> the VMID mask to avoid hard-coding it in multiple places later.
>>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>> ---
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c      | 19 +++++++++++++++++--
>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h      |  2 ++
>>>>>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h |  6 ++++++
>>>>>  3 files changed, 25 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> index c9f204d..294c467 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>>>> @@ -30,6 +30,8 @@
>>>>>  const struct kgd2kfd_calls *kgd2kfd;
>>>>>  bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>>>>
>>>>> +static const unsigned int compute_vmid_bitmap = 0xFF00;
>>>>> +
>>>>>  int amdgpu_amdkfd_init(void)
>>>>>  {
>>>>>      int ret;
>>>>> @@ -137,9 +139,12 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
>>>>>      int last_valid_bit;
>>>>>      if (adev->kfd) {
>>>>>          struct kgd2kfd_shared_resources gpu_resources = {
>>>>> -            .compute_vmid_bitmap = 0xFF00,
>>>>> +            .compute_vmid_bitmap = compute_vmid_bitmap,
>>>>>              .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
>>>>> -            .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe
>>>>> +            .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
>>>>> +            .gpuvm_size = adev->vm_manager.max_pfn
>>>>> +                        << AMDGPU_GPU_PAGE_SHIFT,
>>>> That most likely doesn't work as intended on Vega10. The address
>>>> space is divided into an upper and a lower range, but max_pfn
>>>> includes both.
>>>>
>>>> I suggest using something like min(adev->vm_manager.max_pfn <<
>>>> AMDGPU_GPU_PAGE_SHIFT, AMDGPU_VM_HOLE_START).
>>> I think this is fine as it is. This just tells the Thunk the size of
>>> the virtual address space supported by the GPU. Currently the Thunk
>>> only uses 40 bits for SVM, but eventually it will be able to use the
>>> entire 47 bits of user address space. Any excess address space will
>>> just go unused.
>>>
>>> I'm also wondering how universal the split 48-bit virtual address
>>> space layout is. Even for x86_64 there seems to be a 5-level page
>>> table layout that supports 56 bits of user mode addresses
>>> (Documentation/x86/x86_64/mm.txt). AArch64 seems to support 48-bit
>>> user mode addresses (Documentation/arm64/memory.txt). I haven't
>>> found similar information for PowerPC yet.
>>>
>>> We should avoid coding too much architecture-specific logic into
>>> this driver, which is supposed to support other architectures as
>>> well. I should also review the aperture placement with bigger user
>>> mode address spaces in mind.
>> And that is exactly the reason why I suggested changing this.
>>
>> The split between the lower and upper range is not related to the CPU
>> configuration; instead, it is a property of our GPU setup and
>> hardware generation.
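Spelled out in the context of the patch, the clamping I suggested above
would look roughly like this (an untested sketch, assuming
AMDGPU_VM_HOLE_START marks the start of the hole between the two
ranges):

    struct kgd2kfd_shared_resources gpu_resources = {
        .compute_vmid_bitmap = compute_vmid_bitmap,
        .num_pipe_per_mec = adev->gfx.mec.num_pipe_per_mec,
        .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
        /* Report only the lower range: max_pfn covers both the
         * lower and the upper range, but everything in between
         * is a hole on Vega10.
         */
        .gpuvm_size = min(adev->vm_manager.max_pfn
                          << AMDGPU_GPU_PAGE_SHIFT,
                          AMDGPU_VM_HOLE_START),
    };

That keeps the interface a single size value while never advertising
addresses the hardware cannot map.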
> How exactly does the GPUVM hardware behave when it sees a virtual
> address "in the hole"? Does it generate a VM fault? Or does it simply
> ignore the high bits?

At least on the Vega10 system I've tested, you get a range fault when
you try to access the hole.

> If it ignored the high bits, it would work OK for both x86_64 (with a
> split address space) and ARM64 (with a full 48-bit user mode virtual
> address space).
>
> On the other hand, if addresses in the hole generate a VM fault, then
> I agree with you, and we should only report 47 bits of virtual
> address space to user mode for SVM purposes.

Yes, according to my testing, that is exactly the case here. The
hardware documentation is a bit ambiguous, so I also initially thought
that the high bits are just ignored, but that doesn't seem to be the
case.

Regards,
Christian.

>
> Regards,
>   Felix
>
>> So we should either split it into gpuvm_size_low/gpuvm_size_high, or
>> just use the lower range and limit the value as suggested above.
>>
>> This should avoid having any GPU-generation- or CPU-configuration-
>> specific logic in the common interface.
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>>   Felix
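P.S.: To make the faulting behaviour concrete: with the sign-extended
48-bit layout discussed here, the hole is everything between the top of
the lower range and the bottom of the upper range. A check for it would
look roughly like this (a sketch only; the two constants are assumptions
derived from a 47-bit lower range, not values copied from the amdgpu
headers, and the helper name is made up for illustration):

    #include <linux/types.h>

    /* Assumed layout: 48 bits, sign-extended. The lower range ends at
     * 2^47, the upper range starts at 0xffff800000000000. Anything in
     * between takes a range fault on Vega10.
     */
    #define VM_HOLE_START 0x0000800000000000ULL
    #define VM_HOLE_END   0xffff800000000000ULL

    /* Return true if addr falls in the unaddressable hole between the
     * lower and upper range.
     */
    static bool addr_in_vm_hole(u64 addr)
    {
        return addr >= VM_HOLE_START && addr < VM_HOLE_END;
    }

With a check like this, user mode could validate SVM addresses up front
instead of finding out via a range fault.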