On 18/11/17 03:31 PM, Christian König wrote:
> Am 17.11.2017 um 17:09 schrieb Michel Dänzer:
>> On 17/11/17 11:28 AM, Christian König wrote:
>>> Ping? Michel, Alex, can somebody take a look?
>>
>> Patch 2 is
>>
>> Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
>>
>>
>> With patches 1 & 3, it's not 100% clear to me what the idea is behind
>> the handling of the hole on the kernel and userspace side. Maybe you
>> can add some explanation in code comments or the commit logs?
>
> Yeah, that is actually a bit of a mess, because the hardware
> documentation wasn't very clear on how this works.
>
> How about this as an extra code comment on patch 1 for the assignment
> of dev_info.virtual_address_max:
>
> /*
>  * Old userspace isn't aware of the VA hole on Vega10, so in theory a
>  * client could get invalid VA addresses assigned.
>  * To fix this and keep backward compatibility, we limit the VA space
>  * reported in this field to the range below the hole.
>  */
>
> The last patch then reports the VA space above the hole, because that
> is actually what libdrm should use.
>
> The crux is that if I put the VA space above the hole directly into
> the old fields, older versions of libdrm would break, and we can't do
> that.

That much was actually clear. :)

Maybe some questions will expose what I'm missing:

Patch 1:

> @@ -880,14 +880,14 @@ static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device *adev,
>  		if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
>  			continue;
>  
> -		r = amdgpu_cs_find_mapping(p, chunk_ib->va_start,
> -					   &aobj, &m);
> +		va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
> +		r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);

I don't understand how userspace can make use of the address space
above the hole: since AMDGPU_VA_HOLE_MASK is 0x0000ffffffffffffULL and
AMDGPU_VA_HOLE_END is 0xffff800000000000ULL, doesn't masking with
AMDGPU_VA_HOLE_MASK always result in an address below or inside the
hole?

> @@ -566,6 +566,17 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
>  		return -EINVAL;
>  	}
>  
> +	if (args->va_address >= AMDGPU_VA_HOLE_START &&
> +	    args->va_address < AMDGPU_VA_HOLE_END) {
> +		dev_dbg(&dev->pdev->dev,
> +			"va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
> +			args->va_address, AMDGPU_VA_HOLE_START,
> +			AMDGPU_VA_HOLE_END);
> +		return -EINVAL;
> +	}
> +
> +	args->va_address &= AMDGPU_VA_HOLE_MASK;

Ignoring the above, I think the masking with AMDGPU_VA_HOLE_MASK would
need to be done before checking for the hole, otherwise the masked
address could end up inside the hole?

Patch 3:

> @@ -577,10 +578,17 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
>  		dev_info.ids_flags |= AMDGPU_IDS_FLAGS_FUSION;
>  	if (amdgpu_sriov_vf(adev))
>  		dev_info.ids_flags |= AMDGPU_IDS_FLAGS_PREEMPTION;
> +
> +	vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
>  	dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
>  	dev_info.virtual_address_max =
> -		min(adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE,
> -		    AMDGPU_VA_HOLE_START);
> +		min(vm_size, AMDGPU_VA_HOLE_START);
> +
> +	vm_size -= AMDGPU_VA_RESERVED_SIZE;
> +	if (vm_size > AMDGPU_VA_HOLE_START) {
> +		dev_info.high_va_offset = AMDGPU_VA_HOLE_END;
> +		dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;
> +	}

This mostly makes sense, except for the

	dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;

line. Can you explain how simply or'ing together the address of the
end of the hole and the VM size works correctly in all cases?

As an extreme example, with vm_size == 1 this value would compute as
0xffff800000000001ULL, which seems wrong, or at least weird.
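
To put some numbers to the first question, here's a small standalone
sketch of the masking arithmetic (plain userspace C, not the kernel
code; HOLE_MASK and HOLE_END are the values cited above, and the
AMDGPU_VA_HOLE_START value is my reading of patch 1, so treat it as an
assumption):

#include <stdio.h>
#include <stdint.h>

#define AMDGPU_VA_HOLE_START	0x0000800000000000ULL
#define AMDGPU_VA_HOLE_END	0xffff800000000000ULL
#define AMDGPU_VA_HOLE_MASK	0x0000ffffffffffffULL

int main(void)
{
	/* A VA from the range above the hole, i.e. what patch 3 would
	 * report to new userspace via high_va_offset/high_va_max. */
	uint64_t va = AMDGPU_VA_HOLE_END + 0x1000;
	uint64_t masked = va & AMDGPU_VA_HOLE_MASK;

	printf("va:     0x%016llx\n", (unsigned long long)va);
	printf("masked: 0x%016llx\n", (unsigned long long)masked);

	/* masked prints as 0x0000800000001000, which is >=
	 * AMDGPU_VA_HOLE_START, i.e. numerically inside the hole. */
	return 0;
}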
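
Likewise for the ordering question on the amdgpu_gem_va_ioctl hunk,
here's a comparison of the patch's order with the one I would have
expected (same caveats and constants as in the previous sketch, error
handling reduced to a return code):

#include <stdio.h>
#include <stdint.h>

#define AMDGPU_VA_HOLE_START	0x0000800000000000ULL
#define AMDGPU_VA_HOLE_END	0xffff800000000000ULL
#define AMDGPU_VA_HOLE_MASK	0x0000ffffffffffffULL

/* The patch's order: check the raw address first, mask afterwards. An
 * address above the hole passes the check, but its masked value then
 * lies inside the hole range. */
static int check_then_mask(uint64_t va)
{
	if (va >= AMDGPU_VA_HOLE_START && va < AMDGPU_VA_HOLE_END)
		return -1;	/* -EINVAL in the kernel */
	va &= AMDGPU_VA_HOLE_MASK;
	/* e.g. va == AMDGPU_VA_HOLE_END + 0x1000 arrives here as
	 * 0x0000800000001000, inside the hole. */
	return 0;
}

/* The order I would have expected: mask first, then check. But with
 * these constants that also rejects every address above the hole,
 * which ties back to my first question. */
static int mask_then_check(uint64_t va)
{
	va &= AMDGPU_VA_HOLE_MASK;
	if (va >= AMDGPU_VA_HOLE_START && va < AMDGPU_VA_HOLE_END)
		return -1;
	return 0;
}

int main(void)
{
	uint64_t va = AMDGPU_VA_HOLE_END + 0x1000;

	printf("check_then_mask: %d\n", check_then_mask(va)); /* 0 */
	printf("mask_then_check: %d\n", mask_then_check(va)); /* -1 */
	return 0;
}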
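
And to put numbers to the high_va_max line as well, here's the extreme
example next to what I assume is the realistic Vega10 case (the 48 bit
VA space and the 2 MiB reserved size are purely my guesses for
illustration):

#include <stdio.h>
#include <stdint.h>

#define AMDGPU_VA_HOLE_END	0xffff800000000000ULL

int main(void)
{
	/* The extreme example, ignoring for a moment that the
	 * vm_size > AMDGPU_VA_HOLE_START guard would skip this case. */
	uint64_t weird = AMDGPU_VA_HOLE_END | 1;

	/* Something closer to the real case: a 48 bit VA space minus
	 * an assumed 2 MiB of AMDGPU_VA_RESERVED_SIZE. */
	uint64_t vm_size = (1ULL << 48) - (2ULL << 20);
	uint64_t high_va_max = AMDGPU_VA_HOLE_END | vm_size;

	printf("vm_size == 1:    0x%016llx\n", (unsigned long long)weird);
	printf("48 bit VA space: 0x%016llx\n",
	       (unsigned long long)high_va_max);

	/* Prints 0xffff800000000001 and 0xffffffffffe00000. */
	return 0;
}

If I have the bit twiddling right, the or only behaves like an
addition for vm_size values that don't share set bits with
AMDGPU_VA_HOLE_END, which is exactly why I'm wondering about the
general case.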

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer