Am 28.06.2017 um 04:33 schrieb John Brooks: > When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace, > it should only be treated as a hint to initially place a BO somewhere CPU > accessible, rather than having a permanent effect on BO placement. > > Instead of the flag being set in stone at BO creation, set the flag when a > page fault occurs so that it goes somewhere CPU-visible, and clear it when > the BO is requested by the GPU. > > However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to > invisible VRAM, where they may promptly generate another page fault. When > BOs are constantly moved back and forth like this, it is highly detrimental > to performance. Only clear the flag on CS if: > > - The BO wasn't page faulted for a certain amount of time (currently 10 > seconds), and > - its last page fault didn't occur too soon (currently 500ms) after its > last CS request, or vice versa. > > Setting the flag in amdgpu_fault_reserve_notify() also means that we can > remove the loop to restrict lpfn to the end of visible VRAM, because > amdgpu_ttm_placement_init() will do it for us. I'm fine with the general approach, but I'm still absolutely not keen about clearing the flag when userspace has originally specified it. Please add a new flag something like AMDGPU_GEM_CREATE_CPU_ACCESS_HINT or something like this. Regards, Christian. > > Signed-off-by: John Brooks <john at fastquake.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 45 +++++++++++++++++++++++------- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + > 3 files changed, 38 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 071b592..1ac3c2e 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -341,6 +341,8 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p, > if (bo->pin_count) > return 0; > > + amdgpu_bo_clear_cpu_access_required(bo); > + > /* Don't move this buffer if we have depleted our allowance > * to move it. Don't move anything if the threshold is zero. > */ > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index b71775c..658d7b1 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -943,8 +943,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > { > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev); > struct amdgpu_bo *abo; > - unsigned long offset, size, lpfn; > - int i, r; > + unsigned long offset, size; > + int r; > > if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) > return 0; > @@ -967,15 +967,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > /* hurrah the memory is not visible ! */ > atomic64_inc(&adev->num_vram_cpu_page_faults); > + abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM); > - lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT; > - for (i = 0; i < abo->placement.num_placement; i++) { > - /* Force into visible VRAM */ > - if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) && > - (!abo->placements[i].lpfn || > - abo->placements[i].lpfn > lpfn)) > - abo->placements[i].lpfn = lpfn; > - } > + > r = ttm_bo_validate(bo, &abo->placement, false, false); > if (unlikely(r == -ENOMEM)) { > amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT); > @@ -1033,3 +1027,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo) > > return bo->tbo.offset; > } > + > +/** > + * amdgpu_bo_clear_cpu_access_required > + * @bo: BO to update > + * > + * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while > + * and it didn't have a page fault too soon after the last time it was moved to > + * VRAM. > + * > + * Caller should have bo reserved. > + * > + */ > +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo) > +{ > + const unsigned int page_fault_timeout_ms = 10000; > + const unsigned int min_period_ms = 500; > + unsigned int ms_since_pf, period_ms; > + > + ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies); > + period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies - > + bo->last_cs_move_jiffies)); > + > + /* > + * Try to avoid a revolving door between GTT and VRAM. Clearing the > + * flag may move this BO back to VRAM, so don't clear it if it's likely > + * to page fault and go right back to GTT. > + */ > + if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) || > + (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms)) > + bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > index 3824851..b0cb137 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > @@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev, > struct reservation_object *resv, > struct dma_fence **fence, > bool direct); > +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo); > > > /*