On 20.11.2017 at 15:46, Michel Dänzer wrote:
> On 20/11/17 02:20 PM, Christian König wrote:
>> On 20.11.2017 at 11:36, Michel Dänzer wrote:
>>> On 18/11/17 03:31 PM, Christian König wrote:
>>>> On 17.11.2017 at 17:09, Michel Dänzer wrote:
>>>>> On 17/11/17 11:28 AM, Christian König wrote:
>>>>>> Ping? Michel, Alex, can somebody take a look?
>>>>> Patch 2 is
>>>>>
>>>>> Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>>
>>>>>
>>>>> With patches 1 & 3, it's not 100% clear to me what the idea is behind
>>>>> the handling of the hole on the kernel and userspace side. Maybe you
>>>>> can add some explanation in code comments or the commit logs?
>>>> Yeah, that is actually a bit of a mess, because the hardware
>>>> documentation wasn't very clear on how this works.
>>>>
>>>> How about this as an extra code comment in patch 1 on the assignment of
>>>> dev_info.virtual_address_max:
>>>>
>>>> /*
>>>>  * Old userspace isn't aware of the VA hole on Vega10, so in theory a
>>>>  * client could get invalid VA addresses assigned. To fix this and keep
>>>>  * backward compatibility, we limit the VA space reported in this field
>>>>  * to the range below the hole.
>>>>  */
>>>>
>>>> The last patch then reports the VA space above the hole, because that
>>>> is actually what libdrm should use.
>>>>
>>>> The crux is that if I put the VA space above the hole directly into the
>>>> old fields, older versions of libdrm would break, and we can't do that.
>>> That much was actually clear.
>>> :) Maybe some questions will expose what I'm missing:
>>>
>>>
>>> Patch 1:
>>>
>>>> @@ -880,14 +880,14 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
>>>>             if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
>>>>                 continue;
>>>>
>>>> -           r = amdgpu_cs_find_mapping(p, chunk_ib->va_start,
>>>> -                          &aobj, &m);
>>>> +           va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
>>>> +           r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);
>>> I don't understand how userspace can make use of the address space above
>>> the hole, since AMDGPU_VA_HOLE_MASK is 0x0000ffffffffffffULL and
>>> AMDGPU_VA_HOLE_END is 0xffff800000000000ULL, so masking with
>>> AMDGPU_VA_HOLE_MASK always results in an address below or inside the
>>> hole?
>> Yes, and exactly that is intended here.
>>
>> For programming the MC and filling the page tables, you handle this as
>> if there were no hole in the address space.
>>
>> Only when you start to use the address from userspace do you find that
>> you need to sign extend bit 47 into bits 48-63.
>>
>> Applying AMDGPU_VA_HOLE_MASK to an address masks out the upper 16 bits
>> and so removes the sign extension again.
>>
>>>
>>>> @@ -566,6 +566,17 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>>>>         return -EINVAL;
>>>>     }
>>>>
>>>> +   if (args->va_address >= AMDGPU_VA_HOLE_START &&
>>>> +       args->va_address < AMDGPU_VA_HOLE_END) {
>>>> +       dev_dbg(&dev->pdev->dev,
>>>> +           "va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
>>>> +           args->va_address, AMDGPU_VA_HOLE_START,
>>>> +           AMDGPU_VA_HOLE_END);
>>>> +       return -EINVAL;
>>>> +   }
>>>> +
>>>> +   args->va_address &= AMDGPU_VA_HOLE_MASK;
>>> Ignoring the above, I think the masking with AMDGPU_VA_HOLE_MASK would
>>> need to be done before checking for the hole, otherwise the masked
>>> address could be inside the hole?
>> The masking just removes the sign extension which created the hole in
>> the first place.
>>
>> The VM and MC are programmed as if there were no hole.
>>
>>> Patch 3:
>>>
>>>> @@ -577,10 +578,17 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
>>>>             dev_info.ids_flags |= AMDGPU_IDS_FLAGS_FUSION;
>>>>         if (amdgpu_sriov_vf(adev))
>>>>             dev_info.ids_flags |= AMDGPU_IDS_FLAGS_PREEMPTION;
>>>> +
>>>> +       vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
>>>>         dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
>>>>         dev_info.virtual_address_max =
>>>> -           min(adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE,
>>>> -               AMDGPU_VA_HOLE_START);
>>>> +           min(vm_size, AMDGPU_VA_HOLE_START);
>>>> +
>>>> +       vm_size -= AMDGPU_VA_RESERVED_SIZE;
>>>> +       if (vm_size > AMDGPU_VA_HOLE_START) {
>>>> +           dev_info.high_va_offset = AMDGPU_VA_HOLE_END;
>>>> +           dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;
>>>> +       }
>>> This mostly makes sense, except for the
>>>
>>>        dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;
>>>
>>> line. Can you explain how simply or'ing together the address of the end
>>> of the hole and the VM size works correctly in all cases?
>>>
>>> As an extreme example, with vm_size == 1, this value would compute as
>>> 0xffff800000000001ULL, which seems wrong or at least weird.
>> We check above whether vm_size is larger than AMDGPU_VA_HOLE_START, in
>> other words larger than 0x800000000000.
>>
>> Since vm_size is again in the value range the MC expects (without the
>> hole), OR-ing in AMDGPU_VA_HOLE_END gives the right result: it sign
>> extends bit 47 into bits 48-63, e.g. yielding 0xffff800000000000 and up.
>>
>>
>> And yes, that initially confused me as well. Especially that you need to
>> program the MC and fill the page tables as if there were no hole in the
>> address space.
>>
>> Applying "& AMDGPU_VA_HOLE_MASK" and "| AMDGPU_VA_HOLE_END" just
>> converts between the representations with and without the hole.
> Thanks for the explanation, I get it now.
>
> For the benefit of future readers of this code, it would be nice to have
> a short version of this explanation as a comment, e.g. above the
> AMDGPU_VA_HOLE_* defines.

I've added a comment and pushed the resulting patch with a CC: stable tag
added.

What about the libdrm side of the change? Can anybody take a look?

Thanks,
Christian.