On Mon, Jun 26, 2017 at 06:38:29PM +0900, Michel Dänzer wrote: > On 24/06/17 02:39 AM, John Brooks wrote: > > There is no need for page faults to force BOs into visible VRAM if it's > > full, and the time it takes to do so is great enough to cause noticeable > > stuttering. Add GTT as a possible placement so that if visible VRAM is > > full, page faults move BOs to GTT instead of evicting other BOs from VRAM. > > > > Signed-off-by: John Brooks <john at fastquake.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 ++++++-------- > > 1 file changed, 6 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > index 244a7d3..751bc05 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > > @@ -981,10 +981,11 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > > > /* hurrah the memory is not visible ! */ > > atomic64_inc(&adev->num_vram_cpu_page_faults); > > - amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM); > > + amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM | > > + AMDGPU_GEM_DOMAIN_GTT); > > lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT; > > for (i = 0; i < abo->placement.num_placement; i++) { > > - /* Force into visible VRAM */ > > + /* Try to move the BO into visible VRAM */ > > if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) && > > (!abo->placements[i].lpfn || > > abo->placements[i].lpfn > lpfn)) > > @@ -993,16 +994,13 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) > > abo->placement.busy_placement = abo->placement.placement; > > abo->placement.num_busy_placement = abo->placement.num_placement; > > r = ttm_bo_validate(bo, &abo->placement, false, false); > > - if (unlikely(r == -ENOMEM)) { > > - amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT); > > - return ttm_bo_validate(bo, &abo->placement, false, false); > > - } else if (unlikely(r != 0)) { > > + if (unlikely(r != 0)) > > return r; > > - } > > > > offset = bo->mem.start << PAGE_SHIFT; > > /* this should never happen */ > > - if ((offset + size) > adev->mc.visible_vram_size) > > + if (bo->mem.mem_type == TTM_PL_VRAM && > > + (offset + size) > adev->mc.visible_vram_size) > > return -EINVAL; > > > > return 0; > > > > AFAICT this is essentially the same as > https://patchwork.freedesktop.org/patch/156993/ . I retracted that patch > due to it causing Rally Dirt performance degradation for Marek. > Presumably other patches in this series compensate for that, but at the > least this patch should be moved to the end of the series. Yeah, it did end up pretty much the same as your patch :p The reason this causes the performance degredation is that if a page fault moves a BO to GTT, and it's *not* marked CPU_ACCESS_REQUIRED, then CS will move it to invisible VRAM. And then another page fault happens. And this repeats fast enough that the BO is constantly moving back and forth between GTT and VRAM. Patch 6 (Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS) takes care of it by not allowing BOs to go to invisible VRAM until after a timeout and only if it didn't have a page fault too soon after the last time it was moved to VRAM. This was implemented by setting the CPU_ACCESS_REQUIRED flag on page faults and not clearing it unless those conditions are met. It doesn't necessarily need to involve the flag, depending on where we decide to go with that. John > > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Mesa and X developer