On 2017-09-22 08:13 AM, Christian König wrote: > From: Christian König <christian.koenig at amd.com> > > Convert GTT mappings into linear ones for huge page handling. > > Signed-off-by: Christian König <christian.koenig at amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 24 +++++++++++++++++++++--- > 1 file changed, 21 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > index 3c86381..16454bd 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > @@ -1732,6 +1732,7 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev, > } > > do { > + dma_addr_t *dma_addr = NULL; > uint64_t max_entries; > uint64_t addr, last; > > @@ -1745,15 +1746,32 @@ static int amdgpu_vm_bo_split_mapping(struct amdgpu_device *adev, > } > > if (pages_addr) { > + uint64_t count; > + > max_entries = min(max_entries, 16ull * 1024ull); > - addr = 0; > + for (count = 1; count < max_entries; ++count) { > + uint64_t idx = pfn + count; > + > + if (pages_addr[idx] != > + (pages_addr[idx - 1] + PAGE_SIZE)) > + break; > + } > + > + if (count < 64) { Is 64 just an arbitrary number? As I understand it, this determines the smallest contiguous allocation you can take advantage of. It would correspond to a 256KB fragment size. I suppose this is a trade-off between TLB efficiency and the number of calls to amdgpu_vm_bo_update_mapping. Any particular reason for picking 64? To me the obvious choices would be 16 (64KB fragment) or 512 (2MB huge page). I'm not saying it's wrong. It's just kind of arbitrary and could use a comment. Other than that, this patch is Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>. Regards,  Felix > + addr = pfn << PAGE_SHIFT; > + dma_addr = pages_addr; > + } else { > + addr = pages_addr[pfn]; > + max_entries = count; > + } > + > } else if (flags & AMDGPU_PTE_VALID) { > addr += adev->vm_manager.vram_base_offset; > + addr += pfn << PAGE_SHIFT; > } > - addr += pfn << PAGE_SHIFT; > > last = min((uint64_t)mapping->last, start + max_entries - 1); > - r = amdgpu_vm_bo_update_mapping(adev, exclusive, pages_addr, vm, > + r = amdgpu_vm_bo_update_mapping(adev, exclusive, dma_addr, vm, > start, last, flags, addr, > fence); > if (r)