On Thu, Nov 23, 2017 at 08:47:45PM +0000, Chris Wilson wrote:
> Ironlake does support saving and reloading context specific
> registers between contexts, providing isolation of the basic GPU state
> (as programmable by userspace). This allows userspace to assume that the
> GPU retains their state from one batch to the next, minimising the
> amount of state it needs to reload.
>
> v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
> CXT_SIZE and context-layout do not match in bspec, but the difference is
> irrelevant as we overallocate the full page anyway (Ville).
>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 16 ++++++++++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++++
>  2 files changed, 29 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index fede62daf3e1..5b99125a179b 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -175,6 +175,22 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
>  			return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
>  					PAGE_SIZE);
>  		case 5:
> +			/*
> +			 * There is a discrepancy here between the size reported
> +			 * by the register and the size of the context layout
> +			 * in the docs. Both are described as authoritative!
> +			 *
> +			 * The discrepancy is on the order of a few cachelines,
> +			 * but the total is under one page (4k), which is our
> +			 * minimum allocation anyway so it should all come
> +			 * out in the wash.
> +			 */
> +			cxt_size = I915_READ(CXT_SIZE) + 1;
> +			DRM_DEBUG_DRIVER("gen%d CXT_SIZE = %d bytes [0x%08x]\n",
> +					 INTEL_GEN(dev_priv),
> +					 cxt_size * 64,
> +					 cxt_size - 1);
> +			return round_up(cxt_size * 64, PAGE_SIZE);
>  		case 4:
>  		case 3:
>  		case 2:
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e2085820b586..2074749b27a5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1403,11 +1403,14 @@ static inline int mi_set_context(struct drm_i915_gem_request *rq, u32 flags)
>  		/* These flags are for resource streamer on HSW+ */
>  		flags |= HSW_MI_RS_SAVE_STATE_EN | HSW_MI_RS_RESTORE_STATE_EN;
>  	else
> +		/* We need to save the extended state for powersaving modes */
>  		flags |= MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN;
>
>  	len = 4;
>  	if (IS_GEN7(i915))
>  		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
> +	if (IS_GEN5(i915))
> +		len += 2;
>
>  	cs = intel_ring_begin(rq, len);
>  	if (IS_ERR(cs))
> @@ -1430,6 +1433,14 @@ static inline int mi_set_context(struct drm_i915_gem_request *rq, u32 flags)
>  						GEN6_PSMI_SLEEP_MSG_DISABLE);
>  			}
>  		}
> +	} else if (IS_GEN5(i915)) {
> +		/*
> +		 * This w/a is only listed for pre-production ilk a/b steppings,
> +		 * but is also mentioned for programming the powerctx. To be
> +		 * safe, just apply the workaround; we do not use SyncFlush so
> +		 * this should never take effect and so be a no-op!
> +		 */
> +		*cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;

Maybe also toss in the name from the w/a db?
Reviewed-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>

>  	}
>
>  	*cs++ = MI_NOOP;
> @@ -1464,6 +1475,8 @@ static inline int mi_set_context(struct drm_i915_gem_request *rq, u32 flags)
>  			*cs++ = MI_NOOP;
>  		}
>  		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> +	} else if (IS_GEN5(i915)) {
> +		*cs++ = MI_SUSPEND_FLUSH;
>  	}
>
>  	intel_ring_advance(rq, cs);
> --
> 2.15.0

--
Ville Syrjälä
Intel OTC