Re: [PATCH] drm/i915/cmdparser: Limit clflush to active cachelines

On Fri, Mar 10, 2017 at 12:42:34PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> 
> > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> >> 
> >> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
> >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> >> > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> >> >> > 
> >> >> > > We only need to clflush those cachelines that we have validated to be
> >> >> > > read by the GPU. Userspace typically fills the batch length in
> >> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> >> > >
> >> >> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >> >> > > ---
> >> >> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> >> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> > >
> >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> >> >> > >  	}
> >> >> > >  
> >> >> > >  	if (ret == 0 && needs_clflush_after)
> >> >> > > -		drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> >> >> > > +		drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> >> > > +				       (void *)cmd - shadow_batch_obj->mm.mapping);
> >> >> > 
> >> >> > (void *)cmd - ptr_mask_bit(shadow_batch_obj->mm.mapping)
> >> >> > 
> >> >> > We get away as the wb mapping being zero but for correctness.
> >> >> 
> >> >> The low bits of mm.mapping cannot change the cacheline and so doesn't
> >> >> affect the clflush. Same as before.
> >> >
> >> > Although, not quite the same as before as before we didn't use a fixed
> >> > end-point and so it could overshoot by a cacheline, even across a page
> >> > boundary.
> >> 
> >> And there could be PAGE_MASK worth of lowbits.
> >
> > There can? Wasn't anticipating adding a few thousand memory types, and
> > then sharing a single cache. :-p
> 
> #define ptr_mask_bits(ptr) ({                                    \
>          unsigned long __v = (unsigned long)(ptr);               \
>          (typeof(ptr))(__v & PAGE_MASK);                         \
>  })

Yes, I didn't name that function very well. The low bits of
obj->mm.mapping currently represent an index into the cache.
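
To make the point about the low bits concrete, here is a minimal userspace
sketch. It is not the i915 code: every sketch_* name, the 4096-byte page
size and the 64-byte cacheline are assumptions for illustration, and it
assumes a flush helper that rounds the start address down to a cacheline
boundary and walks up to addr + length. Under those assumptions a small tag
in the low bits of the mapping pointer shortens the computed length by
exactly the amount it advances the start, so the flushed span is unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for illustration only; real values are platform specific. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_CACHELINE 64UL

/* Store a small tag in the low bits of a page-aligned address. */
static uintptr_t sketch_pack(uintptr_t page_addr, unsigned long tag)
{
	return page_addr | tag;
}

/* Recover the page-aligned address (what a PAGE_MASK-based mask yields). */
static uintptr_t sketch_mask(uintptr_t packed)
{
	return packed & SKETCH_PAGE_MASK;
}

/* Recover the tag held in the low bits. */
static unsigned long sketch_tag(uintptr_t packed)
{
	return packed & ~SKETCH_PAGE_MASK;
}

/* Span covered by a flush of [addr, addr + length), start rounded down. */
struct sketch_span {
	uintptr_t first; /* address of the first cacheline flushed */
	uintptr_t end;   /* exclusive end of the flushed range */
};

static struct sketch_span sketch_flush_span(uintptr_t addr, size_t length)
{
	struct sketch_span s;

	s.end = addr + length;
	s.first = addr & ~(SKETCH_CACHELINE - 1);
	return s;
}

/* Returns 0 if the tagged and masked pointers flush the same span. */
static int sketch_selfcheck(void)
{
	const uintptr_t base = 0x10000;   /* page-aligned mapping address */
	const unsigned long tag = 1;      /* small tag, below one cacheline */
	const uintptr_t cmd = base + 200; /* where parsing stopped */
	const uintptr_t packed = sketch_pack(base, tag);
	struct sketch_span a, b;

	/* Length computed from the tagged pointer, as in the patch. */
	a = sketch_flush_span(packed, cmd - packed);
	/* Length computed from the masked pointer, as suggested in review. */
	b = sketch_flush_span(sketch_mask(packed), cmd - sketch_mask(packed));

	if (sketch_tag(packed) != tag)
		return -1;
	return (a.first == b.first && a.end == b.end) ? 0 : -1;
}
```

In this sketch the equivalence only holds while the tag is smaller than one
cacheline. A tag of 64 or more would round the start to a later cacheline,
which is the case the PAGE_MASK remark above is concerned with.
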
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



