Quoting Michel Thierry (2018-04-27 21:12:38) > On 4/27/2018 12:32 PM, Chris Wilson wrote: > > Previously, we just reset the ring register in the context image such > > that we could skip over the broken batch and emit the closing > > breadcrumb. However, on resume the context image and GPU state would be > > reloaded, which may have been left in an inconsistent state by the > > reset. The presumption was that at worst it would just cause another > > reset and skip again until it recovered, however it seems just as likely > > to cause an unrecoverable hang. Instead of risking loading an incomplete > > context image, restore it back to the default state. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> > > Cc: Michel Thierry <michel.thierry@xxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/intel_lrc.c | 24 +++++++++++++++++------- > > 1 file changed, 17 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > > index ce23d5116482..422b05290ed6 100644 > > --- a/drivers/gpu/drm/i915/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/intel_lrc.c > > @@ -1804,8 +1804,8 @@ static void reset_common_ring(struct intel_engine_cs *engine, > > struct i915_request *request) > > { > > struct intel_engine_execlists * const execlists = &engine->execlists; > > - struct intel_context *ce; > > unsigned long flags; > > + u32 *regs; > > > > GEM_TRACE("%s request global=%x, current=%d\n", > > engine->name, request ? request->global_seqno : 0, > > @@ -1855,14 +1855,24 @@ static void reset_common_ring(struct intel_engine_cs *engine, > > * future request will be after userspace has had the opportunity > > * to recreate its own state. > > */ > > - ce = &request->ctx->engine[engine->id]; > > - execlists_init_reg_state(ce->lrc_reg_state, > > - request->ctx, engine, ce->ring); > > + regs = request->ctx->engine[engine->id].lrc_reg_state; > > + if (engine->default_state) { > > + void *defaults; > > + > > + defaults = i915_gem_object_pin_map(engine->default_state, > > + I915_MAP_WB); > > + if (!IS_ERR(defaults)) { > > + memcpy(regs, > > + defaults + LRC_HEADER_PAGES * PAGE_SIZE, > > + engine->context_size); > Hi, > > The context_size is taking into count the PP_HWSP page, do we also need > to rewrite the PP_HSWP? (or just the logical state). > > Also regs is already pointing to the start of the logical state > (vaddr + LRC_STATE_PN * PAGE_SIZE). Yeah, I was aiming for just the register state, and had a nice little off by one in comparing the macros. > So if we want to overwrite from the PP_HWSP, then regs is not the right > offset, or if we only want to change the logical state then it should be > from 'defaults + LRC_STATE_PN * PAGE_SIZE'. Right, I don't think we need to scrub the HWSP, just the register state. The context is lost at this point, and what I want to protect is the read of the image following the reset. Afaik, we don't issue any reads from PPHWSP. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx