Re: [PATCH] drm/i915: workaround bad DSL readout v2

Daniel Vetter <daniel@xxxxxxxx> · Mon, 14 Sep 2015 16:29:44 +0200



On Mon, Sep 14, 2015 at 04:02:44PM +0300, Ville Syrjälä wrote:
> On Mon, Sep 14, 2015 at 11:10:04AM +0200, Daniel Vetter wrote:
> > On Fri, Sep 11, 2015 at 01:33:08AM +0300, Ville Syrjälä wrote:
> > > On Thu, Sep 10, 2015 at 02:57:32PM -0700, Jesse Barnes wrote:
> > > > On 09/10/2015 02:53 PM, Ville Syrjälä wrote:
> > > > > On Thu, Sep 10, 2015 at 02:38:53PM -0700, Jesse Barnes wrote:
> > > > >> On HSW at least (still testing other platforms, but should be harmless
> > > > >> elsewhere), the DSL reg reads back as 0 when read around vblank start
> > > > >> time.  This ends up confusing the atomic start/end checking code, since
> > > > >> it causes the update to appear as if it crossed a frame count boundary.
> > > > >> Avoid the problem by making sure we don't return scanline_offset from
> > > > >> the get_crtc_scanline function.  In moving the code there, I add to add
> > > > >> an additional delay since it could be called and have a legitimate 0
> > > > >> result for some time (depending on the pixel clock).
> > > > >>
> > > > >> v2: move hsw dsl read hack to get_crtc_scanline (Ville)
> > > > >>
> > > > >> References: https://bugs.freedesktop.org/show_bug.cgi?id=91579
> > > > >> Signed-off-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> > > > >> ---
> > > > >>  drivers/gpu/drm/i915/i915_irq.c | 21 +++++++++++++++++++++
> > > > >>  1 file changed, 21 insertions(+)
> > > > >>
> > > > >> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > >> index 90bc6c2..97e5d52 100644
> > > > >> --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > >> +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > >> @@ -697,6 +697,27 @@ static int __intel_get_crtc_scanline(struct intel_crtc *crtc)
> > > > >>  		position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)) & DSL_LINEMASK_GEN3;
> > > > >>  
> > > > >>  	/*
> > > > >> +	 * On HSW, the DSL reg (0x70000) appears to return 0 if we
> > > > >> +	 * read it right around the start of vblank.  So try it again
> > > > >> +	 * so we don't accidentally end up spanning a vblank frame
> > > > >> +	 * increment, causing the pipe_update_end() code to squak at us.
> > > > >> +	 */
> > > > >> +	if (IS_HASWELL(dev) && !position) {
> > > > >> +		int i, temp;
> > > > >> +
> > > > >> +		for (i = 0; i < 100; i++) {
> > > > >> +			udelay(1);
> > > > >> +			temp = __raw_i915_read32(dev_priv, PIPEDSL(pipe)) &
> > > > >> +				DSL_LINEMASK_GEN3;
> > > > >> +			if (temp != position) {
> > > > >> +				position = temp;
> > > > >> +				goto out;
> > > > >> +			}
> > > > >> +		}
> > > > >> +	}
> > > > > 
> > > > > Hmm. Another idea. If it always happens at start of vbl, maybe have a
> > > > > look at the ISR. If scanline reads 0, but ISR says we're in vblank, just
> > > > > return vblank_start. That's assming ISR would in fact show that it's in
> > > > > vblank when this happens.
> > > > 
> > > > Sounds a bit racy,
> > > 
> > > Not too much I think. We're under spinlock_irqsave here, and if the
> > > scanline should magically start to tick just after we've read it we
> > > would still give almost the right answer, even if we got magically
> > > delayed by one or two scanlines. That's assuming that it has already
> > > latched the double buffered registers when it goes wonky. If the
> > > register latching happens only when the counter starts working again,
> > > well then we're in deeper doodoo. But then I wouldn't expect ISR to
> > > indicate that we're in vblank then either. So I think you should
> > > definitely check what the ISR actually tells you while things are
> > > in the bad state.
> > 
> > There's still a race
> > 1. we read bogus vblank counter
> > 2. vblank processing completes in hw
> > 3. ISR indicates where out of vblank.
> > 
> > So you'd need at least a loop to make sure you're 0 vblank counter is
> > stable across the ISR read.
> 
> Seems rather unlikely. That would mean something blocked us for the entire
> duration of the vblank.
> 
> > -Daniel
> > 
> > > In case the badness happens just before start of vblank, well, then
> > > I have no good idea how to handle it. This sort of retry loop is
> > > the best that comes to mind right now :(
> > > 
> > > > though I guess it wouldn't hurt.  I'm actually a
> > > > little dubious about this change anyway since it will mean longer delays
> > > > when we really are at the top of the field/frame.  You sure you don't
> > > > like the first one better?
> > > 
> > > The first one can't help vblank timestamps. Although I suppose that if
> > > the badness always cures itself before the interrupt gets raised, we
> > > would not hit this normally when we update the timestamp from the irq
> > > handler. We could hit it when updating the timestamp outside the irq
> > > handler though (ie. during vblank_get/put etc.).
> > 
> > On top of that there's patches floating from Chris to get a vblank
> > timestamp without grabbing the interrupts temporarily or any spinlock.
> > That helps when we immediately disable the interrupt and then a bit later
> > the compositors asks for the current vblank to estimate the next frame.
> 
> I think we all already agreed that it's not going to fly. If we don't
> ask the hw how many vblanks we lost we can't give userspace any kind
> of accurate information.
> 
> > So if possible I'd definitely vote to share this "fix up bonghits vblank"
> > logic if possible ...
> 
> I have no idea what you're looking to share and with whom. I do not want
> workarounds for hardware specific bugs in the shared vblank code. We
> already had that due to radeon, and that stuff has been doing nothing
> for us except hinder attempts at making the code work sanely for sane
> Intel hardware.
> 
> Which reminds me, I need to post a pile of vblank patches...

s/vblank/scanline/ in all of the above since I was confused again. And I
just thought if we have some special hacks to make the get_scanline
function cope with hw oddity that should be used by the vblank time stamp
code too. But that's the case already (if we put the hack into
__intel_get_crtc_scanline).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx