Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > Sometimes Icelake forgets to reset the CSB pointers on a GPU reset, > leading to carry on updating the old tail of the buffer. > > <0>[ 618.138490] i915_sel-5636 3d..1 673425465us : trace_ports: vecs0: submit { 14de2:504, 0:0 } > <0>[ 618.138490] i915_sel-5636 3.... 673425493us : intel_engine_reset: vecs0 flags=100 > <0>[ 618.138490] i915_sel-5636 3.... 673425493us : execlists_reset_prepare: vecs0: depth<-0 > <0>[ 618.138490] i915_sel-5636 3.... 673425493us : intel_engine_stop_cs: vecs0 > <0>[ 618.138490] i915_sel-5636 3.... 673425523us : __intel_gt_reset: engine_mask=40 > <0>[ 618.138490] i915_sel-5636 3.... 673425568us : execlists_reset: vecs0 > <0>[ 618.138490] i915_sel-5636 3d..1 673425568us : process_csb: vecs0 cs-irq head=1, tail=2 > <0>[ 618.138490] i915_sel-5636 3d..1 673425568us : process_csb: vecs0 csb[2]: status=0x00000001:0x40000000 > <0>[ 618.138490] i915_sel-5636 3d..1 673425569us : trace_ports: vecs0: promote { 14de2:504*, 0:0 } > <0>[ 618.138490] i915_sel-5636 3d..1 673425570us : __i915_request_reset: vecs0 rq=14de2:504, guilty? yes > <0>[ 618.138490] i915_sel-5636 3d..1 673425571us : __execlists_reset: vecs0 replay {head:2de0, tail:2e48} > <0>[ 618.138490] i915_sel-5636 3d..1 673425572us : __i915_request_unsubmit: vecs0 fence 14de2:504, current 503 > <0>[ 618.138490] i915_sel-5636 3.... 673435544us : intel_engine_cancel_stop_cs: vecs0 > <0>[ 618.138490] i915_sel-5636 3.... 673435544us : process_csb: vecs0 cs-irq head=11, tail=11 > <0>[ 618.138490] i915_sel-5636 3d..1 673435545us : __i915_request_submit: vecs0 fence 14de2:504, current 503 > <0>[ 618.138490] i915_sel-5636 3d..1 673435546us : __execlists_submission_tasklet: vecs0: queue_priority_hint:-2147483648, submit:yes > <0>[ 618.138490] i915_sel-5636 3d..1 673435548us : trace_ports: vecs0: submit { 14de2:504*, 0:0 } > <0>[ 618.138490] i915_sel-5636 3.... 673435549us : execlists_reset_finish: vecs0: depth->0 > <0>[ 618.138490] ksoftirq-21 2..s. 673435592us : process_csb: vecs0 cs-irq head=11, tail=3 > <0>[ 618.138490] ksoftirq-21 2..s. 673435593us : process_csb: vecs0 csb[0]: status=0x00000001:0x40000000 > <0>[ 618.138490] ksoftirq-21 2..s. 673435594us : trace_ports: vecs0: promote { 14de2:504*, 0:0 } > <0>[ 618.138490] ksoftirq-21 2..s. 673435596us : process_csb: vecs0 csb[1]: status=0x00000018:0x40000040 > <0>[ 618.138490] ksoftirq-21 2..s. 673435597us : trace_ports: vecs0: completed { 14de2:504*, 0:0 } > <0>[ 618.138490] ksoftirq-21 2..s. 673435612us : process_csb: process_csb:2188 GEM_BUG_ON(!i915_request_completed(*execlists->active) && !reset_in_progress(execlists)) > > After the reset, we do another clflush before checking the CSB to be > sure we see whatever was left in the CSB prior to the reset. So it is > unlikely to be an incoherent view of the CSB, and more likely that > Icelake didn't reset its pointers. > > References: 582a6f90aa0d ("drm/i915/execlists: Add a paranoid flush of the CSB pointers upon reset") > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > index 8d79a965f341..b869a9c7b3c6 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -2996,6 +2996,14 @@ static void reset_csb_pointers(struct intel_engine_cs *engine) > WRITE_ONCE(*execlists->csb_write, reset_value); > wmb(); /* Make sure this is visible to HW (paranoia?) */ > > + /* > + * Sometimes Icelakes forgets to reset its pointers on a GPU reset. > + * Bludgeon them with a mmio update to be sure. > + */ > + ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR, > + reset_value << 8 | reset_value); > + ENGINE_POSTING_READ(engine, RING_CONTEXT_STATUS_PTR); > + Big hammer indeed. Next one is that we assert the reset value and check that first one after reset comes out with 0x0? Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > invalidate_csb_entries(&execlists->csb_status[0], > &execlists->csb_status[reset_value]); > } > -- > 2.24.0.rc2 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx