On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote: > Am 28.06.2017 um 04:33 schrieb John Brooks: > >When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace, > >it should only be treated as a hint to initially place a BO somewhere CPU > >accessible, rather than having a permanent effect on BO placement. > > > >Instead of the flag being set in stone at BO creation, set the flag when a > >page fault occurs so that it goes somewhere CPU-visible, and clear it when > >the BO is requested by the GPU. > > > >However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to > >invisible VRAM, where they may promptly generate another page fault. When > >BOs are constantly moved back and forth like this, it is highly detrimental > >to performance. Only clear the flag on CS if: > > > >- The BO wasn't page faulted for a certain amount of time (currently 10 > > seconds), and > >- its last page fault didn't occur too soon (currently 500ms) after its > > last CS request, or vice versa. > > > >Setting the flag in amdgpu_fault_reserve_notify() also means that we can > >remove the loop to restrict lpfn to the end of visible VRAM, because > >amdgpu_ttm_placement_init() will do it for us. > > I'm fine with the general approach, but I'm still absolutely not keen about > clearing the flag when userspace has originally specified it. > > Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or > something like this. > > Regards, > Christian. Is it the fact that we clear a flag that userspace expects not to have changed if it queries it later? I think that's the only effect of this that's directly visible to userspace code. Would it be permissible to change the handling of the flag without clearing it? As for a new "hint" flag, I assume this new flag would be an alternative to the existing CPU_ACCESS_REQUIRED flag, and we'd change Mesa et al to use it in situations where the kernel *should* place a BO somewhere CPU-accessible, but is permitted to move it elsewhere. Is that correct? John > > > > >Signed-off-by: John Brooks <john at fastquake.com> > >--- > > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++------- > > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + > > 3 files changed, 38 insertions(+), 10 deletions(-) > > > >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >index 071b592..1ac3c2e 100644 > >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > >@@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p, > > if (bo->pin_count) > > return 0; > >+ amdgpu_bo_clear_cpu_access_required(bo); > >+ > > /* Don't move this buffer if we have depleted our allowance > > * to move it. Don't move anything if the threshold is zero. > > */ > >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > >index b71775c..658d7b1 100644 > >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > >@@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > { > > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev); > > struct amdgpu_bo *abo; > >- unsigned long offset, size, lpfn; > >- int i, r; > >+ unsigned long offset, size; > >+ int r; > > if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) > > return 0; > >@@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > /* hurrah the memory is not visible ! */ > > atomic64_inc(&adev->num_vram_cpu_page_faults); > >+ abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > > amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM); > >- lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT; > >- for (i = 0; i < abo->placement.num_placement; i++) { > >- /* Force into visible VRAM */ > >- if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) && > >- (!abo->placements[i].lpfn || > >- abo->placements[i].lpfn > lpfn)) > >- abo->placements[i].lpfn = lpfn; > >- } > >+ > > r = ttm_bo_validate(bo, &abo->placement, false, false); > > if (unlikely(r == -ENOMEM)) { > > amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT); > >@@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo) > > return bo->tbo.offset; > > } > >+ > >+/** > >+ * amdgpu_bo_clear_cpu_access_required > >+ * @bo: BO to update > >+ * > >+ * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while > >+ * and it didn't have a page fault too soon after the last time it was moved to > >+ * VRAM. > >+ * > >+ * Caller should have bo reserved. > >+ * > >+ */ > >+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo) > >+{ > >+ const unsigned int page_fault_timeout_ms = 10000; > >+ const unsigned int min_period_ms = 500; > >+ unsigned int ms_since_pf, period_ms; > >+ > >+ ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies); > >+ period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies - > >+ bo->last_cs_move_jiffies)); > >+ > >+ /* > >+ * Try to avoid a revolving door between GTT and VRAM. Clearing the > >+ * flag may move this BO back to VRAM, so don't clear it if it's likely > >+ * to page fault and go right back to GTT. > >+ */ > >+ if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) || > >+ (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms)) > >+ bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > >+} > >diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > >index 3824851..b0cb137 100644 > >--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > >+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > >@@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev, > > struct reservation_object *resv, > > struct dma_fence **fence, > > bool direct); > >+void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo); > > /* > >