[PATCH 3/9] drm/amdgpu: Don't force BOs into visible VRAM for page faults

john@xxxxxxxxxxxxx (John Brooks) · Tue, 27 Jun 2017 03:25:27 +0000

On Mon, Jun 26, 2017 at 06:38:29PM +0900, Michel DÃ¤nzer wrote:
> On 24/06/17 02:39 AM, John Brooks wrote:
> > There is no need for page faults to force BOs into visible VRAM if it's
> > full, and the time it takes to do so is great enough to cause noticeable
> > stuttering. Add GTT as a possible placement so that if visible VRAM is
> > full, page faults move BOs to GTT instead of evicting other BOs from VRAM.
> > 
> > Signed-off-by: John Brooks <john at fastquake.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 14 ++++++--------
> >  1 file changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > index 244a7d3..751bc05 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > @@ -981,10 +981,11 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
> >  
> >  	/* hurrah the memory is not visible ! */
> >  	atomic64_inc(&adev->num_vram_cpu_page_faults);
> > -	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
> > +	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
> > +					 AMDGPU_GEM_DOMAIN_GTT);
> >  	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
> >  	for (i = 0; i < abo->placement.num_placement; i++) {
> > -		/* Force into visible VRAM */
> > +		/* Try to move the BO into visible VRAM */
> >  		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
> >  		    (!abo->placements[i].lpfn ||
> >  		     abo->placements[i].lpfn > lpfn))
> > @@ -993,16 +994,13 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
> >  	abo->placement.busy_placement = abo->placement.placement;
> >  	abo->placement.num_busy_placement = abo->placement.num_placement;
> >  	r = ttm_bo_validate(bo, &abo->placement, false, false);
> > -	if (unlikely(r == -ENOMEM)) {
> > -		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
> > -		return ttm_bo_validate(bo, &abo->placement, false, false);
> > -	} else if (unlikely(r != 0)) {
> > +	if (unlikely(r != 0))
> >  		return r;
> > -	}
> >  
> >  	offset = bo->mem.start << PAGE_SHIFT;
> >  	/* this should never happen */
> > -	if ((offset + size) > adev->mc.visible_vram_size)
> > +	if (bo->mem.mem_type == TTM_PL_VRAM &&
> > +	    (offset + size) > adev->mc.visible_vram_size)
> >  		return -EINVAL;
> >  
> >  	return 0;
> > 
> 
> AFAICT this is essentially the same as
> https://patchwork.freedesktop.org/patch/156993/ . I retracted that patch
> due to it causing Rally Dirt performance degradation for Marek.
> Presumably other patches in this series compensate for that, but at the
> least this patch should be moved to the end of the series.

Yeah, it did end up pretty much the same as your patch :p

The reason this causes the performance degredation is that if a page fault
moves a BO to GTT, and it's *not* marked CPU_ACCESS_REQUIRED, then CS will move
it to invisible VRAM. And then another page fault happens. And this repeats
fast enough that the BO is constantly moving back and forth between GTT and
VRAM.

Patch 6 (Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS) takes care of
it by not allowing BOs to go to invisible VRAM until after a timeout and only
if it didn't have a page fault too soon after the last time it was moved to
VRAM. This was implemented by setting the CPU_ACCESS_REQUIRED flag on page
faults and not clearing it unless those conditions are met. It doesn't
necessarily need to involve the flag, depending on where we decide to go with
that.

John

> 
> 
> -- 
> Earthling Michel DÃ¤nzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer