On Mon, Jun 03, 2013 at 01:50:46PM -0300, Rodrigo Vivi wrote:
> On Mon, Jun 3, 2013 at 8:34 AM, Ville Syrjälä
> <ville.syrjala at linux.intel.com> wrote:
> > On Fri, May 31, 2013 at 05:15:41PM -0300, Rodrigo Vivi wrote:
> >> Hi Ville,
> >>
> >> Thanks for the comments.
> >>
> >>
> >> On Fri, May 31, 2013 at 12:59 PM, Ville Syrjälä
> >> <ville.syrjala at linux.intel.com> wrote:
> >> > On Tue, May 28, 2013 at 09:25:12PM -0300, Rodrigo Vivi wrote:
> >> >> WaFbcNukeOn3DBlt for IVB, HSW and VLV.
> >> >
> >> > VLV doesn't have FBC, so this is a bit incorrect.
> >>
> >> I'm going to remove the vlv mention that incorrectly came from the spec...
> >>
> >> >
> >> >>
> >> >> According to BSpec: "Workaround: Do not enable Render Command Streamer tracking for FBC.
> >> >> Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that
> >> >> follows each render submission."
> >> >>
> >> >> v2: Chris noticed that the flush_domains check was missing here and also suggested doing
> >> >> the LRI only when fbc is enabled. To avoid doing an I915_READ on every flush, let's use
> >> >> the module parameter check.
> >> >>
> >> >> v3: Adding Wa name as Damien suggested.
> >> >>
> >> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi at gmail.com>
> >> >> ---
> >> >>  drivers/gpu/drm/i915/i915_reg.h         |  2 ++
> >> >>  drivers/gpu/drm/i915/intel_pm.c         |  2 +-
> >> >>  drivers/gpu/drm/i915/intel_ringbuffer.c | 32 ++++++++++++++++++++++++++++++++
> >> >>  3 files changed, 35 insertions(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> >> index cc4c223..81ac584 100644
> >> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> >> @@ -977,6 +977,8 @@
> >> >>  /* Framebuffer compression for Ivybridge */
> >> >>  #define IVB_FBC_RT_BASE 0x7020
> >> >>
> >> >> +#define MSG_FBC_REND_STATE	0x50380
> >> >> +#define   FBC_REND_NUKE		(1<<2)
> >> >>
> >> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_A	0x420B0
> >> >>  #define _HSW_PIPE_SLICE_CHICKEN_1_B	0x420B4
> >> >> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> >> >> index 1879188..e830a9b 100644
> >> >> --- a/drivers/gpu/drm/i915/intel_pm.c
> >> >> +++ b/drivers/gpu/drm/i915/intel_pm.c
> >> >> @@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
> >> >>  	struct drm_i915_gem_object *obj = intel_fb->obj;
> >> >>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >> >>
> >> >> -	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
> >> >> +	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
> >> >>
> >> >>  	if (!intel_edp_is_psr_enabled(dev))
> >> >>  		I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
> >> >> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> index 3d2c236..69491db 100644
> >> >> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> >> @@ -280,6 +280,30 @@ gen7_render_ring_cs_stall_wa(struct intel_ring_buffer *ring)
> >> >>  	return 0;
> >> >>  }
> >> >>
> >> >> +static int gen7_ring_fbc_flush(struct intel_ring_buffer *ring)
> >> >> +{
> >> >> +	struct drm_device *dev = ring->dev;
> >> >> +	int ret;
> >> >> +
> >> >> +	if (i915_enable_fbc == 0)
> >> >> +		return 0;
> >> >> +
> >> >> +	if (i915_enable_fbc < 0 && !IS_HASWELL(dev))
> >> >> +		return 0;
> >> >> +
> >> >> +	ret = intel_ring_begin(ring, 4);
> >> >> +	if (ret)
> >> >> +		return ret;
> >> >> +	intel_ring_emit(ring, MI_NOOP);
> >> >> +	/* WaFbcNukeOn3DBlt:ivb/hsw/vlv */
> >> >
> >> > Another mention of vlv. I can see BSpec makes the same mistake in
> >> > the register description though.
> >>
> >> ... as you noticed.
> >>
> >> >
> >> >> +	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> >> >> +	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> >> >> +	intel_ring_emit(ring, FBC_REND_NUKE);
> >> >> +	intel_ring_advance(ring);
> >> >> +
> >> >> +	return 0;
> >> >> +}
> >> >> +
> >> >>  static int
> >> >>  gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >> >>  		       u32 invalidate_domains, u32 flush_domains)
> >> >> @@ -336,6 +360,9 @@ gen7_render_ring_flush(struct intel_ring_buffer *ring,
> >> >>  	intel_ring_emit(ring, 0);
> >> >>  	intel_ring_advance(ring);
> >> >>
> >> >> +	if (flush_domains)
> >> >> +		return gen7_ring_fbc_flush(ring);
> >> >> +
> >> >>  	return 0;
> >> >>  }
> >> >>
> >> >> @@ -1623,6 +1650,7 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> >> >>  static int blt_ring_flush(struct intel_ring_buffer *ring,
> >> >>  			  u32 invalidate, u32 flush)
> >> >>  {
> >> >> +	struct drm_device *dev = ring->dev;
> >> >>  	uint32_t cmd;
> >> >>  	int ret;
> >> >>
> >> >> @@ -1645,6 +1673,10 @@ static int blt_ring_flush(struct intel_ring_buffer *ring,
> >> >>  	intel_ring_emit(ring, 0);
> >> >>  	intel_ring_emit(ring, MI_NOOP);
> >> >>  	intel_ring_advance(ring);
> >> >> +
> >> >> +	if (IS_GEN7(dev))
> >> >> +		return gen7_ring_fbc_flush(ring);
> >> >
> >> > Should check flush_domains here as well?
> >>
> >> here is flush domain by definition, isn't it?
> >
> > How so?
>
> this function is the ring->flush. how is it possible to have it out of
> flush domain? or are the names just confusing me?

->flush() takes as parameters invalidate and flush domain bitmasks. You
already check for flush_domains in gen7_render_ring_flush(). If you're
just invalidating, nothing gets flushed to memory and hence the nuke
procedure is not needed.

> >> >
> >> > So we're now using the same nuke mechanism from the blt ring too.
> >> > Should we then drop the regular blitter tracking things from fbc_enable?
> >>
> >> This is a good question. Since this is a critical patch and it is
> >> working the way it is, I prefer to leave it the way it is and
> >> promise that I will try to drop old blitter tracking for ivb and hsw
> >> later. If it works I'll send the drop in another patch.
> >>
> >> >
> >> > Oh and what about vcs and vecs, should we nuke from those rings as well?
> >> > I guess it would be strange to write to the primary plane's buffer via
> >> > vcs, but I'm assuming vebox could write the same formats that we can
> >> > scan out...
> >>
> >> To be truly honest with you I have no idea about these cases. The spec
> >> just says to put it after every pipe_control following flush
> >> renderings... and blt.
> >
> > IIRC the spec doesn't say anything about blt.
> >
> > Ah, the PM guide tells you to do LRIs w/ blt too. But it actually says
> > that you should do "cache clean" LRIs instead of "nuke" LRIs.
>
> It is just the name, but same address and bit.

It's not the same bit. Nuke is bit 2, cache clean is bit 1.

> from BSpec: "Driver must program a MI_FLUSH_DW followed by a LRI into
> the BCS ring to generate a cache clean message to FBC (LRI to offset
> 0x50380 with data 0x00000002)."

0x2 == (1 << 1) aka the cache clean bit, just like the text said.

>
> With this blit the way it is I'm getting the best rendering and power
> saving performance.
>
> >
> >>
> >> >
> >> >> +
> >> >>  	return 0;
> >> >>  }
> >> >>
> >> >> --
> >> >> 1.8.1.4
> >> >>
> >> >> _______________________________________________
> >> >> Intel-gfx mailing list
> >> >> Intel-gfx at lists.freedesktop.org
> >> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >> >
> >> > --
> >> > Ville Syrjälä
> >> > Intel OTC
> >>
> >>
> >>
> >> --
> >> Rodrigo Vivi
> >> Blog: http://blog.vivi.eng.br
> >
> > --
> > Ville Syrjälä
> > Intel OTC
>
>
>
> --
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Ville Syrjälä
Intel OTC
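
To make the two review points above concrete (skip the LRI when only
invalidating, and nuke vs. cache clean being different bits), here is a
minimal standalone sketch. It is not driver code and not a proposed
revision of the patch. MSG_FBC_REND_STATE and FBC_REND_NUKE are taken
from the patch; FBC_REND_CACHE_CLEAN and fbc_lri_data() are hypothetical
names made up for illustration, and the ring-to-bit mapping is only what
the BSpec text quoted in this thread suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register offset and nuke bit as defined in the patch under review. */
#define MSG_FBC_REND_STATE	0x50380
#define FBC_REND_NUKE		(1u << 2)
/* Hypothetical name: the patch does not define the cache clean bit. */
#define FBC_REND_CACHE_CLEAN	(1u << 1)

/* The LRI data values quoted from BSpec in this thread. */
static_assert(FBC_REND_NUKE == 0x00000004u, "render ring text: nuke");
static_assert(FBC_REND_CACHE_CLEAN == 0x00000002u, "BCS ring text: cache clean");

/*
 * Hypothetical helper: pick the LRI payload a flush would write to
 * MSG_FBC_REND_STATE. Returns 0 when no LRI is needed.
 */
static uint32_t fbc_lri_data(bool is_blt_ring, uint32_t flush_domains)
{
	/* Invalidate-only: nothing is flushed to memory, so no message. */
	if (!flush_domains)
		return 0;

	return is_blt_ring ? FBC_REND_CACHE_CLEAN : FBC_REND_NUKE;
}
```

With this, fbc_lri_data(true, 0) yields no LRI at all, which is the case
the unconditional IS_GEN7() check in blt_ring_flush() does not cover.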