[PATCH 30/35] drm/i915: Replace the ILK/SNB/IVB/HSW watermark code

ville.syrjala at linux.intel.com (Ville Syrjälä) · Fri, 5 Jul 2013 13:49:01 +0300

On Fri, Jul 05, 2013 at 10:37:08AM +0100, Chris Wilson wrote:
> On Fri, Jul 05, 2013 at 11:57:42AM +0300, ville.syrjala at linux.intel.com wrote:
> > From: Ville Syrj?l? <ville.syrjala at linux.intel.com>
> > 
> > There is a major problem with the watermark registers; they're not
> > double buffered. So we need to make sure we update them at the correct
> > time when messing about with planes. The correct time is the beginning
> > of vblank.
> > 
> > So when we determine that the watermarks need to updated hand in hand
> > with the next vblank, we store the pre-computed watermarks under
> > intel_crtc, and when the vblank happens, we promote the pending
> > watermarks to active status.
> > 
> > on HSW when the watermarks for any pipe change, we must merge the
> > watermarks from all pipes so that we can determine the correct LP1+
> > watermark levels. For simplicity we follow the same codepaths for
> > pre-HSW hardware as well, but there all the LP1+ watermarks will be
> > disabled when multiple pipes are enabled. Once the watermarks are
> > merged we check them for validity, disabling any invalid levels.
> > 
> > Touching the watermark registers causes the hardware to re-evaluate the
> > watermarks, which expeds some power. So after merging the watermarks
> > we check which watermark registers actually need to be changed. And
> > finally we write the watermarks registers in the correct order.
> 
> Yet you do not justify doing so from interrupt context. A simple way
> would be to set safe WM (min of current vs next) before the config
> change, then schedule an update outside of interrupt context after the
> vblank.
> 
> I really want an explanation for why doing so in interrupt context is
> the only sane way. Real power numbers vs complexity please.

One problem is that there might not be a safe set of values that satisfy
both current and future constraints. One example could be a "low bpp/no
primary + sprite -> high bpp primary + no sprite" case. I guess we could
just reject such changes, but that feels a bit wrong. It would introduce
a weird restriction that would propably baffle userspace (doing a->c fails
but doing a->b->wait_for_vbl->c works), or we'd need to block for one frame
which is a big no-no these days.

Also we would still need to track how the current hardware state
changes across vblanks, so it would still end up doing some extra
stuff at irq time.

My hope was that the current approach wouldn't turn out to be too
expensive. Whether people consider 5 usec excessive, I don't know.
And maybe we could shave some of that off still.

Not that I'm 100% attached to the current design. We could certainly
try other ways, if people think that's worthwile.

-- 
Ville Syrj?l?
Intel OTC