> -----Original Message----- > From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf > Of Christian König > Sent: Friday, August 12, 2016 12:56 PM > To: Alex Deucher > Cc: Kasiviswanathan, Harish; amd-gfx list > Subject: Re: [PATCH 1/6] drm/amdgpu: write PTEs directly into the IB. > > Am 12.08.2016 um 17:46 schrieb Alex Deucher: > > On Fri, Aug 12, 2016 at 9:52 AM, Christian König > > <deathsimple at vodafone.de> wrote: > >> From: Christian König <christian.koenig at amd.com> > >> > >> Write the PTEs at the end of the IB instead of directly into the SDMA > commands. > >> This can save quite some CPU cycles building the entries. > >> > >> Signed-off-by: Christian König <christian.koenig at amd.com> > >> --- > >> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 26 > +++++++++++++++++++++----- > >> 1 file changed, 21 insertions(+), 5 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > >> index 2843132..7efcbe3 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > >> @@ -910,15 +910,15 @@ static int > amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, > >> /* padding, etc. */ > >> ndw = 64; > >> > >> - if (params.src) { > >> + if (src) { > >> /* only copy commands needed */ > >> ndw += ncmds * 7; > >> > >> - } else if (params.pages_addr) { > >> - /* header for write data commands */ > >> - ndw += ncmds * 4; > >> + } else if (pages_addr) { > >> + /* copy commands needed */ > >> + ndw += ncmds * 7; > >> > >> - /* body of write data command */ > >> + /* and also PTEs */ > >> ndw += nptes * 2; > >> > >> } else { > >> @@ -935,6 +935,22 @@ static int > amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev, > >> > >> params.ib = &job->ibs[0]; > >> > >> + if (!src && pages_addr) { > >> + uint64_t *pte; > >> + unsigned i; > >> + > >> + /* Put the PTEs at the end of the IB. */ > >> + i = ndw - nptes * 2; > >> + pte= (uint64_t *)&(job->ibs->ptr[i]); > >> + params.src = job->ibs->gpu_addr + i * 4; > > Is the offset correct for all asics? IIRC, ndw was kind of a worst > > case guess as the packet header sizes vary across families. > > Yeah, that should work, but I can double check once more. > > I actually don't change the dw estimation. Just instead of using the > inline write command I stitch together the page table entries at the end > of dw first and then use the copy command to move them over to the page > tables. Ah, I get it now. That's the piece I was missing. Can you add something like that to the commit message? With that, the series is: Reviewed-by: Alex Deucher <alexander.deucher at amd.com> > > That has the clear advantage of being way more cache friendly, because > you don't jump around between dithings any more. > > Christian. > > > > > Alex > > > >> + > >> + for (i = 0; i < nptes; ++i) { > >> + pte[i] = amdgpu_vm_map_gart(pages_addr, addr + i * > >> + AMDGPU_GPU_PAGE_SIZE); > >> + pte[i] |= flags; > >> + } > >> + } > >> + > >> r = amdgpu_sync_fence(adev, &job->sync, exclusive); > >> if (r) > >> goto error_free; > >> -- > >> 2.5.0 > >> > >> _______________________________________________ > >> amd-gfx mailing list > >> amd-gfx at lists.freedesktop.org > >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > _______________________________________________ > amd-gfx mailing list > amd-gfx at lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx