Op 28-10-15 om 18:20 schreef Ville Syrjälä: > On Wed, Oct 28, 2015 at 04:56:39PM +0000, Zanoni, Paulo R wrote: >> Em Ter, 2015-10-27 às 20:32 +0200, Ville Syrjälä escreveu: >>> On Tue, Oct 27, 2015 at 02:50:04PM -0200, Paulo Zanoni wrote: >>>> The hardware already takes care of disabling and recompressing FBC >>>> when we do a page flip, so all we need to do is to update the fence >>>> registers and move on. >>>> >>>> One of the important things to notice is that on the pre-gen6 >>>> platforms the fence is programmed on the FBC control register and >>>> the >>>> documentation says we can't update the control register while FBC >>>> is >>>> enabled. This would basically mean we'd have to to disable+enable >>>> FBC >>>> at every flip in order to make sure the hardware tracking still >>>> works, >>>> which doesn't seem to make too much sense. So I sent an email to >>>> the >>>> hardware team requesting some clarification. The information I got >>>> was >>>> this: >>>> >>>> "I don't think any hardware is latching on to 0x100100, 0x100104, >>>> or >>>> the old fence number in FBC_CTL. Instructions against changing on >>>> the >>>> fly would be to simplify testing and ensure you don't miss any >>>> invalidation that happened right at that time." >>>> >>>> So I guess we're fine for flips. But I can't really say I tested >>>> FBC >>>> on these ancient platforms - nor that I'll ever propose enabling >>>> FBC >>>> by default on them exactly because of problems like these. >>> Yeah, I did this in my patches too, and IIRC it seemed to work just >>> fine >>> back then. >>> >>>> v2: Add comment at flush() (Chris). >>>> >>>> Testcase: igt/kms_frontbuffer_tracking/fbc*-fliptrack >>>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@xxxxxxxxx> >>>> --- >>>> drivers/gpu/drm/i915/i915_drv.h | 1 + >>>> drivers/gpu/drm/i915/i915_reg.h | 3 + >>>> drivers/gpu/drm/i915/intel_display.c | 1 - >>>> drivers/gpu/drm/i915/intel_drv.h | 2 + >>>> drivers/gpu/drm/i915/intel_fbc.c | 103 >>>> ++++++++++++++++++++++++++++++- >>>> drivers/gpu/drm/i915/intel_frontbuffer.c | 1 + >>>> 6 files changed, 108 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h >>>> b/drivers/gpu/drm/i915/i915_drv.h >>>> index ee14962..77da500 100644 >>>> --- a/drivers/gpu/drm/i915/i915_drv.h >>>> +++ b/drivers/gpu/drm/i915/i915_drv.h >>>> @@ -928,6 +928,7 @@ struct i915_fbc { >>>> bool (*fbc_enabled)(struct drm_i915_private *dev_priv); >>>> void (*enable_fbc)(struct intel_crtc *crtc); >>>> void (*disable_fbc)(struct drm_i915_private *dev_priv); >>>> + void (*flip_prepare)(struct drm_i915_private *dev_priv); >>>> }; >>>> >>>> /** >>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h >>>> b/drivers/gpu/drm/i915/i915_reg.h >>>> index 8942532..3d598a2 100644 >>>> --- a/drivers/gpu/drm/i915/i915_reg.h >>>> +++ b/drivers/gpu/drm/i915/i915_reg.h >>>> @@ -2028,6 +2028,7 @@ enum skl_disp_power_wells { >>>> #define FBC_CTL_C3_IDLE (1<<13) >>>> #define FBC_CTL_STRIDE_SHIFT (5) >>>> #define FBC_CTL_FENCENO_SHIFT (0) >>>> +#define FBC_CTL_FENCENO_MASK 0xF >>> It's only three bits on gen2. But bit 3 is MBZ and there are only 8 >>> fence registers on those platforms, so this is OK. >>> >>>> #define FBC_COMMAND 0x0320c >>>> #define FBC_CMD_COMPRESS (1<<0) >>>> #define FBC_STATUS 0x03210 >>>> @@ -2064,6 +2065,7 @@ enum skl_disp_power_wells { >>>> #define DPFC_CTL_LIMIT_1X (0<<6) >>>> #define DPFC_CTL_LIMIT_2X (1<<6) >>>> #define DPFC_CTL_LIMIT_4X (2<<6) >>>> +#define DPFC_CTL_FENCE_MASK 0xF >>>> #define DPFC_RECOMP_CTL 0x320c >>>> #define DPFC_RECOMP_STALL_EN (1<<27) >>>> #define DPFC_RECOMP_STALL_WM_SHIFT (16) >>>> @@ -2086,6 +2088,7 @@ enum skl_disp_power_wells { >>>> #define FBC_CTL_FALSE_COLOR (1<<10) >>>> /* The bit 28-8 is reserved */ >>>> #define DPFC_RESERVED (0x1FFFFF00) >>>> +#define ILK_DPFC_FENCE_MASK 0xF >>>> #define ILK_DPFC_RECOMP_CTL 0x4320c >>>> #define ILK_DPFC_STATUS 0x43210 >>>> #define ILK_DPFC_FENCE_YOFF 0x43218 >>>> diff --git a/drivers/gpu/drm/i915/intel_display.c >>>> b/drivers/gpu/drm/i915/intel_display.c >>>> index bc1907e..d9378c3 100644 >>>> --- a/drivers/gpu/drm/i915/intel_display.c >>>> +++ b/drivers/gpu/drm/i915/intel_display.c >>>> @@ -11502,7 +11502,6 @@ static int intel_crtc_page_flip(struct >>>> drm_crtc *crtc, >>>> to_intel_plane(primary)- >>>>> frontbuffer_bit); >>>> mutex_unlock(&dev->struct_mutex); >>>> >>>> - intel_fbc_disable_crtc(intel_crtc); >>>> intel_frontbuffer_flip_prepare(dev, >>>> to_intel_plane(primary)- >>>>> frontbuffer_bit); >>>> >>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h >>>> b/drivers/gpu/drm/i915/intel_drv.h >>>> index 08847d0..9065a8f 100644 >>>> --- a/drivers/gpu/drm/i915/intel_drv.h >>>> +++ b/drivers/gpu/drm/i915/intel_drv.h >>>> @@ -1305,6 +1305,8 @@ void intel_fbc_invalidate(struct >>>> drm_i915_private *dev_priv, >>>> enum fb_op_origin origin); >>>> void intel_fbc_flush(struct drm_i915_private *dev_priv, >>>> unsigned int frontbuffer_bits, enum >>>> fb_op_origin origin); >>>> +void intel_fbc_flip_prepare(struct drm_i915_private *dev_priv, >>>> + unsigned int frontbuffer_bits); >>>> void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv); >>>> >>>> /* intel_hdmi.c */ >>>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c >>>> b/drivers/gpu/drm/i915/intel_fbc.c >>>> index d9d7e54..62f158b 100644 >>>> --- a/drivers/gpu/drm/i915/intel_fbc.c >>>> +++ b/drivers/gpu/drm/i915/intel_fbc.c >>>> @@ -82,6 +82,22 @@ static void i8xx_fbc_disable(struct >>>> drm_i915_private *dev_priv) >>>> DRM_DEBUG_KMS("disabled FBC\n"); >>>> } >>>> >>>> +static void i8xx_fbc_flip_prepare(struct drm_i915_private >>>> *dev_priv) >>>> +{ >>>> + struct intel_crtc *crtc = dev_priv->fbc.crtc; >>>> + struct drm_framebuffer *fb = crtc->base.primary->fb; >>>> + struct drm_i915_gem_object *obj = intel_fb_obj(fb); >>>> + uint32_t val; >>>> + >>>> + /* Although the documentation suggests we can't change >>>> DPFC_CONTROL >>>> + * while compression is enabled, the hardware guys said >>>> that updating >>>> + * the fence register bits during a flip is fine. */ >>>> + val = I915_READ(FBC_CONTROL); >>>> + val &= ~FBC_CTL_FENCENO_MASK; >>>> + val |= obj->fence_reg; >>>> + I915_WRITE(FBC_CONTROL, val); >>>> +} >>> IIRC I just called the enable function to reconstruct the entire >>> register contents to avoid code duplication. But maybe that's not >>> possible without reowrking the fbc state handling entirely >>> (which I had done). >>>> + >>>> static void i8xx_fbc_enable(struct intel_crtc *crtc) >>>> { >>>> struct drm_i915_private *dev_priv = crtc->base.dev- >>>>> dev_private; >>>> @@ -161,6 +177,22 @@ static void g4x_fbc_enable(struct intel_crtc >>>> *crtc) >>>> DRM_DEBUG_KMS("enabled fbc on plane %c\n", >>>> plane_name(crtc->plane)); >>>> } >>>> >>>> +static void g4x_fbc_flip_prepare(struct drm_i915_private >>>> *dev_priv) >>>> +{ >>>> + struct intel_crtc *crtc = dev_priv->fbc.crtc; >>>> + struct drm_framebuffer *fb = crtc->base.primary->fb; >>>> + struct drm_i915_gem_object *obj = intel_fb_obj(fb); >>>> + uint32_t val; >>>> + >>>> + /* Although the documentation suggests we can't change >>>> DPFC_CONTROL >>>> + * while compression is enabled, the hardware guys said >>>> that updating >>>> + * the fence register bits during a flip is fine. */ >>>> + val = I915_READ(DPFC_CONTROL); >>>> + val &= ~DPFC_CTL_FENCE_MASK; >>>> + val |= obj->fence_reg; >>>> + I915_WRITE(DPFC_CONTROL, val); >>>> +} >>>> + >>>> static void g4x_fbc_disable(struct drm_i915_private *dev_priv) >>>> { >>>> u32 dpfc_ctl; >>>> @@ -236,6 +268,31 @@ static void ilk_fbc_enable(struct intel_crtc >>>> *crtc) >>>> DRM_DEBUG_KMS("enabled fbc on plane %c\n", >>>> plane_name(crtc->plane)); >>>> } >>>> >>>> +static void ilk_fbc_flip_prepare(struct drm_i915_private >>>> *dev_priv) >>>> +{ >>>> + struct intel_crtc *crtc = dev_priv->fbc.crtc; >>>> + struct drm_framebuffer *fb = crtc->base.primary->fb; >>>> + struct drm_i915_gem_object *obj = intel_fb_obj(fb); >>>> + uint32_t val; >>>> + >>>> + /* Although the documentation suggests we can't change >>>> DPFC_CONTROL >>>> + * while compression is enabled, the hardware guys said >>>> that updating >>>> + * the fence register bits during a flip is fine. */ >>>> + val = I915_READ(ILK_DPFC_CONTROL); >>>> + val &= ~ILK_DPFC_FENCE_MASK; >>>> + val |= obj->fence_reg; >>>> + I915_WRITE(ILK_DPFC_CONTROL, val); >>>> +} >>>> + >>>> +static void snb_fbc_flip_prepare(struct drm_i915_private >>>> *dev_priv) >>>> +{ >>>> + struct intel_crtc *crtc = dev_priv->fbc.crtc; >>>> + struct drm_framebuffer *fb = crtc->base.primary->fb; >>>> + struct drm_i915_gem_object *obj = intel_fb_obj(fb); >>>> + >>>> + I915_WRITE(SNB_DPFC_CTL_SA, SNB_CPU_FENCE_ENABLE | obj- >>>>> fence_reg); >>>> +} >>>> + >>>> static void ilk_fbc_disable(struct drm_i915_private *dev_priv) >>>> { >>>> u32 dpfc_ctl; >>>> @@ -1021,13 +1078,48 @@ void intel_fbc_flush(struct >>>> drm_i915_private *dev_priv, >>>> if (origin == ORIGIN_GTT) >>>> return; >>>> >>>> + /* Hardware tracking already recompresses the CFB (nuke) >>>> for us if FBC >>>> + * is enabled and we do a page flip, so we can safely >>>> ignore it here. >>>> + * FBC may be disabled in case we got an invalidate() >>>> before the >>>> + * flush(), so we'll still have to check that case below. >>>> */ >>>> + if (origin == ORIGIN_FLIP && dev_priv->fbc.enabled) >>>> + return; >>>> + >>>> mutex_lock(&dev_priv->fbc.lock); >>>> >>>> dev_priv->fbc.busy_bits &= ~frontbuffer_bits; >>>> >>>> if (!dev_priv->fbc.busy_bits) { >>>> - __intel_fbc_disable(dev_priv); >>>> - __intel_fbc_update(dev_priv); >>>> + if (origin == ORIGIN_FLIP) { >>>> + __intel_fbc_update(dev_priv); >>>> + } else { >>>> + __intel_fbc_disable(dev_priv); >>>> + __intel_fbc_update(dev_priv); >>>> + } >>>> + } >>>> + >>>> + mutex_unlock(&dev_priv->fbc.lock); >>>> +} >>>> + >>>> +void intel_fbc_flip_prepare(struct drm_i915_private *dev_priv, >>>> + unsigned int frontbuffer_bits) >>>> +{ >>>> + unsigned int fbc_bits; >>>> + >>>> + if (!fbc_supported(dev_priv)) >>>> + return; >>>> + >>>> + mutex_lock(&dev_priv->fbc.lock); >>>> + >>>> + if (dev_priv->fbc.enabled) { >>>> + fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv- >>>>> fbc.crtc->pipe); >>> primary->frontbuffer_bit would seem better. >> Why? > Why not? It's there already stashed away for you to use. No need to > sprinkle these INTEL_FRONTBUFFER... things all over the place. > >>>> + if (fbc_bits & frontbuffer_bits) >>>> + dev_priv->fbc.flip_prepare(dev_priv); >>> You would still have to disable+reenable if you need to eg. >>> reallocate >>> the compressed buffer, of if the new fb isn't suitable for fbc, or >>> maybe >>> if you need to change anything else in the control register. >> Yes, but as far as I understand the frontbuffer tracking, this case >> won't happen here. I'm assuming we'll always go through >> intel_fbc_update() on these cases. > Perhaps, but more by luck than by design. With the setplane and atomic > APIs there's no difference between a page flip and other plane updates. > So if you want to take advantage of the hw flip nuke, you would need to > check how the plane state is going to change, and based on that decide > whether a fence update is enough or disable+re-enable is needed. And in > fact I don't see any fbc_disable before the flip, just an fbc_update > after the fact. So seems to me the disable really should be in your > flip_prepare hook in this case. And in fact the whole flip_prepare > thing isn't even called from the setplane/atomic path. I'm working on fixing that, but that still requires a few more patches until page_flip and atomic_commit are unified. :-) If it's useful we could do a simple patch to do the same in atomic commit. I'm moving fb_bits to crtc_state anyway. ~Maarten _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx