On Fri, Mar 10, 2017 at 12:42:34PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> 
> > On Fri, Mar 10, 2017 at 12:26:10PM +0200, Mika Kuoppala wrote:
> >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> >> 
> >> > On Fri, Mar 10, 2017 at 10:04:16AM +0000, Chris Wilson wrote:
> >> >> On Fri, Mar 10, 2017 at 11:58:44AM +0200, Mika Kuoppala wrote:
> >> >> > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> >> >> > 
> >> >> > > We only need to clflush those cachelines that we have validated to be
> >> >> > > read by the GPU. Userspace typically fills the batch length in
> >> >> > > correctly, the exceptions tend to be explicit tests within igt.
> >> >> > > 
> >> >> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >> >> > > ---
> >> >> > >  drivers/gpu/drm/i915/i915_cmd_parser.c | 3 ++-
> >> >> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >> >> > > 
> >> >> > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > index 21b1cd917d81..b9ce9a6881ea 100644
> >> >> > > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> >> >> > > @@ -1331,7 +1331,8 @@ int intel_engine_cmd_parser(struct intel_engine_cs *engine,
> >> >> > >  	}
> >> >> > >  
> >> >> > >  	if (ret == 0 && needs_clflush_after)
> >> >> > > -		drm_clflush_virt_range(shadow_batch_obj->mm.mapping, batch_len);
> >> >> > > +		drm_clflush_virt_range(shadow_batch_obj->mm.mapping,
> >> >> > > +				       (void *)cmd - shadow_batch_obj->mm.mapping);
> >> >> > 
> >> >> > (void *)cmd - ptr_mask_bits(shadow_batch_obj->mm.mapping)
> >> >> > 
> >> >> > We get away as the wb mapping being zero but for correctness.
> >> >> 
> >> >> The low bits of mm.mapping cannot change the cacheline and so don't
> >> >> affect the clflush. Same as before.
> >> > 
> >> > Although, not quite the same as before: previously we didn't use a fixed
> >> > end-point and so it could overshoot by a cacheline, even across a page
> >> > boundary.
> >> 
> >> And there could be PAGE_MASK worth of lowbits.
> > 
> > There can? Wasn't anticipating adding a few thousand memory types, and
> > then sharing a single cache. :-p
> 
> #define ptr_mask_bits(ptr) ({					\
> 	unsigned long __v = (unsigned long)(ptr);		\
> 	(typeof(ptr))(__v & PAGE_MASK);				\
> })

Yes, I didn't name that function very well. The low bits of
obj->mm.mapping currently represent an index into the cache.
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx