Am 23.06.2017 um 19:39 schrieb John Brooks: > When the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag is given by userspace, > it should only be treated as a hint to initially place a BO somewhere CPU > accessible, rather than having a permanent effect on BO placement. And that is a clear NAK from my side. CPU_ACCESS_REQUIRED is a permanent limitation to where the buffer should be placed. Ignoring that and changing the IOCTL interface like this is clearly not acceptable. Christian. > > Instead of the flag being set in stone at BO creation, set the flag when a > page fault occurs so that it goes somewhere CPU-visible, and clear it when > the BO is requested by the GPU. > > However, clearing the CPU_ACCESS_REQUIRED flag may move a BO to invisible > VRAM, which is likely to cause a page fault that moves it right back to > GTT. When this happens too much, it is highly detrimental to performance. > Only clear the flag on CS if: > > - The BO wasn't page faulted for a certain amount of time (currently 10 > seconds, measured with jiffies), and > - its last page fault didn't occur too soon (currently 500ms) after its > last CS request, or vice versa. > > Setting the flag in amdgpu_fault_reserve_notify() also means that we can > remove the loop to restrict lpfn to the end of visible VRAM, because > amdgpu_ttm_placement_init() will do it for us. > > Signed-off-by: John Brooks <john at fastquake.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 46 ++++++++++++++++++++++-------- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 + > 3 files changed, 36 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 2fad8bd..73d6882 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -320,6 +320,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p, > else > domain = bo->allowed_domains; > > + amdgpu_bo_clear_cpu_access_required(bo); > retry: > amdgpu_ttm_placement_from_domain(bo, domain); > initial_bytes_moved = atomic64_read(&adev->num_bytes_moved); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index 31d1f21..a7d48a7 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -967,8 +967,8 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > { > struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev); > struct amdgpu_bo *abo; > - unsigned long offset, size, lpfn; > - int i, r; > + unsigned long offset, size; > + int r; > > if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) > return 0; > @@ -991,18 +991,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > /* hurrah the memory is not visible ! */ > atomic64_inc(&adev->num_vram_cpu_page_faults); > + abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM | > AMDGPU_GEM_DOMAIN_GTT); > - lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT; > - for (i = 0; i < abo->placement.num_placement; i++) { > - /* Try to move the BO into visible VRAM */ > - if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) && > - (!abo->placements[i].lpfn || > - abo->placements[i].lpfn > lpfn)) > - abo->placements[i].lpfn = lpfn; > - } > - abo->placement.busy_placement = abo->placement.placement; > - abo->placement.num_busy_placement = abo->placement.num_placement; > r = ttm_bo_validate(bo, &abo->placement, false, false); > if (unlikely(r != 0)) > return r; > @@ -1057,3 +1048,34 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo) > > return bo->tbo.offset; > } > + > +/** > + * amdgpu_bo_clear_cpu_access_required > + * @bo: BO to update > + * > + * Clears CPU_ACCESS_REQUIRED flag if the BO hasn't had a page fault in a while > + * and it didn't have a page fault too soon after the last time it was moved to > + * VRAM. > + * > + * Caller should have bo reserved. > + * > + */ > +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo) > +{ > + const unsigned int page_fault_timeout_ms = 10000; > + const unsigned int min_period_ms = 500; > + unsigned int ms_since_pf, period_ms; > + > + ms_since_pf = jiffies_to_msecs(jiffies - bo->last_page_fault_jiffies); > + period_ms = jiffies_to_msecs(abs(bo->last_page_fault_jiffies - > + bo->last_cs_move_jiffies)); > + > + /* > + * Try to avoid a revolving door between GTT and VRAM. Clearing the > + * flag may move this BO back to VRAM, so don't clear it if it's likely > + * to page fault and go right back to GTT. > + */ > + if ((!bo->last_page_fault_jiffies || !bo->last_cs_move_jiffies) || > + (ms_since_pf > page_fault_timeout_ms && period_ms > min_period_ms)) > + bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > index 3824851..b0cb137 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h > @@ -182,6 +182,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev, > struct reservation_object *resv, > struct dma_fence **fence, > bool direct); > +void amdgpu_bo_clear_cpu_access_required(struct amdgpu_bo *bo); > > > /*