Quoting Mika Kuoppala (2019-09-12 09:27:56) > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > Quoting Mika Kuoppala (2019-09-12 08:51:38) > >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > >> > >> > After a GPU reset, we need to drain all the CS events so that we have an > >> > accurate picture of the execlists state at the time of the reset. Be > >> > paranoid and force a read of the CSB write pointer from memory. > >> > > >> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > >> > --- > >> > drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++++ > >> > 1 file changed, 4 insertions(+) > >> > > >> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > index 3d83c7e0d9de..61a38a4ccbca 100644 > >> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> > @@ -2836,6 +2836,10 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled) > >> > struct i915_request *rq; > >> > u32 *regs; > >> > > >> > + mb(); /* paranoia: read the CSB pointers from after the reset */ > >> > + clflush(execlists->csb_write); > >> > + mb(); > >> > + > >> > >> We know there is always a cost. We do invalidate the csb > >> on each pass on process_csb. > >> > >> Add csb_write in to invalidate_csb entries along > >> with mbs. Rename it to invalidate_csb and use it > >> always? > >> > >> By doing so, we could prolly throw out the rmb() at > >> the start of the process_csb as we would have invalidated > >> the write pointer along with the entries we read, > >> on previous pass. > > > > No. That rmb is essential for the read ordering at that moment in time. > > Ah yes indeed it is. head vs entries coherency. > > > > > All I have in mind here is a delay, not really a barrier per se, just > > this is a nice way of saying no speculation either. > > Forgetting the rmb(), there is similar pattern of mb()+flush > elsewhere. Just saw the profiliferation and opportunity to converge. I understood. I think your barrier-less w/a works pretty well and I haven't yet poked a hole in how I think it works ;) > But syncing with the hardware on moment of reset, this should > do. I looked at reusing invalidate_csb_entries() and I think the key part here is that we do want to invalidate the execlists->csb_write itself, so a subtly different location/reason (not sure if it's the same cacheline or the neighbouring one). -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx