On Wed, Feb 20, 2013 at 9:06 PM, Paulo Zanoni <przanoni at gmail.com> wrote: > I finally had some time to look at this again. > > 2013/2/9 Daniel Vetter <daniel at ffwll.ch>: >> On Fri, Feb 08, 2013 at 05:35:16PM -0200, Paulo Zanoni wrote: >>> From: Paulo Zanoni <paulo.r.zanoni at intel.com> >>> >>> From the docs: >>> "Only the rising edge of the PCH Display interrupt will cause the >>> North Display IIR (DEIIR) PCH Display Interrupt even bit to be set, >>> so all PCH Display Interrupts, including back to back interrupts, >>> must be cleared before a new PCH Display interrupt can cause DEIIR >>> to be set". >>> >>> The current code works fine because we don't get many interrupts, but >>> if we enable the PCH FIFO underrun interrupts we'll start getting so >>> many interrupts that at some point new PCH interrupts won't cause >>> DEIIR to be set. >>> >>> The initial implementation I tried was to turn the code that checks >>> SDEIIR into a loop, but we can still get interrupts even after the >>> loop is done (and before the irq handler finishes), so we have to >>> either disable the interrupts or mask them. In the end I concluded >>> that just disabling the PCH interrupts is enough, you don't even need >>> the loop, so this is what this patch implements. I've tested it and it >>> passes the 2 "PCH FIFO underrun interrupt storms" I can reproduce: >>> the "ironlake_crtc_disable" case and the "wrong watermarks" case. >>> >>> In other words, here's how to reproduce the problem fixed by this >>> patch: >>> 1 - Enable PCH FIFO underrun interrupts (SERR_INT on SNB+) >>> 2 - Boot the machine >>> 3 - While booting we'll get tons of PCH FIFO underrun interrupts >>> 4 - Plug a new monitor >>> 5 - Run xrandr, notice it won't detect the new monitor >>> 6 - Read SDEIIR and notice it's not 0 while DEIIR is 0 >>> >>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni at intel.com> >> >> This smells fishy. Looking at the code, clearing the SDEIIR before the >> DEIIR looks strange. If we immediately get another south interrupt, it >> will propagate to the DEIIR. > > Unless south interrupts are disabled by SDEIER, which is what my patch does. > >> But since we currently processing south >> interrupts the corresponding bit in DEIIR is set, so when we clear DEIIR >> later on, we kill that south interrupt. And since it's then once lost, >> we'll lose all subsequent ones, too. >> >> So have you tried clearing DEIIR before clearing any of the subordinate >> registers? > > Yes and it doesn't work. I just re-re-tested it today to be sure. Ok, so I guess we just have really funny interrupt handling in our hw. Hidden interrupt queues and fancy irq handling sequences ... So count me convinced, one request for your patch though: Can you please add a small comment to both places where we disable south interrupts to explain what's going on and that this is the only way to make it work? Also please amend the commit message saying that proper ordering of interrupt registers clearing doesn't do what we want (probably due to the 2-deep irq getting in the way I suspect). Thanks, Daniel >> The patch itself is imo wrong, since afaik the hw will drop any interrupts >> if we disable them. Hence any south interrupt happening while we are in >> the irq handler will effectively be lost. > > They won't be lost since I don't mask the interrupts (IMR), I just > disable them (IER), so they are still listed in the SDEIIR register. > What happens is that in the patch I sent, we'll just re-enable the > south interrupts after we re-enable the north interrupts (DEIER it > 31), so exactly after this, if we have pending interrupts stored, > we'll just get another interrupt, then our interrupt handler will run > and process the "missed" interrupt (IIR can queue up to two interrupt > events; when the IIR is cleared, it will set itself again after one > clock if a second event was stored). > > This can be easily verified with the following: > - Apply patch 6 ("drm/915: print PCH poison interrupts") without this > one (it enables SERR_INT, which will trigger the bug) > - Boot SVB with LVDS only > - While booting, the mode sets will trigger a lot of interrupts, which > will trigger the bug solved by this patch > - Plug a new monitor > - Run xrandr, notice the monitor will be listed as "disconnected" > - Dump the interrupt registers, notice that SDEIIR has some bits set > but DEIIR is 0x0 > - Disable south interrupts by writing 0 SDEIER (use intel_reg_write) > - Read SDEIIR and write the same value back (notice it won't become 0 > due to the stored interrupt events) > - Re-enable SDEIER by writing the old value back to it > - Since SDEIIR is not 0 (due to second interrupts stored), this will > generate an interrupt, which will be processed by the Kernel and call > our irq handler (you could put some print messages on the interrupt > handler just to check if you want) > - Check all the interrupt registers, notice everything is fine (SDEIIR > is 0 and DEIIR is 0) > - Run xrandr and the plugged monitor will finally be detected (i.e., > south interrupts are working again) > >> >> And for the underrun reporting irq storm, I guess we need to first have >> infrastructure to block out interrupts while the pipe is going down before >> we can enable them. Assuming that pipe underruns are expected in that >> case. > > And before all the things you've mentioned, we need this patch to > properly clear the interrupt registers so we can deal with them even > if they come too fast. > > I plan to resend a new interrupt series maybe in the next days. > > Thanks for the review, > Paulo > >> -Daniel >> >>> --- >>> drivers/gpu/drm/i915/i915_irq.c | 16 ++++++++++++++-- >>> 1 file changed, 14 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c >>> index f096ad9..500fd65 100644 >>> --- a/drivers/gpu/drm/i915/i915_irq.c >>> +++ b/drivers/gpu/drm/i915/i915_irq.c >>> @@ -701,7 +701,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) >>> { >>> struct drm_device *dev = (struct drm_device *) arg; >>> drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private; >>> - u32 de_iir, gt_iir, de_ier, pm_iir; >>> + u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier; >>> irqreturn_t ret = IRQ_NONE; >>> int i; >>> >>> @@ -711,6 +711,10 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) >>> de_ier = I915_READ(DEIER); >>> I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL); >>> >>> + sde_ier = I915_READ(SDEIER); >>> + I915_WRITE(SDEIER, 0); >>> + POSTING_READ(SDEIER); >>> + >>> gt_iir = I915_READ(GTIIR); >>> if (gt_iir) { >>> snb_gt_irq_handler(dev, dev_priv, gt_iir); >>> @@ -759,6 +763,8 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) >>> >>> I915_WRITE(DEIER, de_ier); >>> POSTING_READ(DEIER); >>> + I915_WRITE(SDEIER, sde_ier); >>> + POSTING_READ(SDEIER); >>> >>> return ret; >>> } >>> @@ -778,7 +784,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) >>> struct drm_device *dev = (struct drm_device *) arg; >>> drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private; >>> int ret = IRQ_NONE; >>> - u32 de_iir, gt_iir, de_ier, pm_iir; >>> + u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier; >>> >>> atomic_inc(&dev_priv->irq_received); >>> >>> @@ -787,6 +793,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) >>> I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL); >>> POSTING_READ(DEIER); >>> >>> + sde_ier = I915_READ(SDEIER); >>> + I915_WRITE(SDEIER, 0); >>> + POSTING_READ(SDEIER); >>> + >>> de_iir = I915_READ(DEIIR); >>> gt_iir = I915_READ(GTIIR); >>> pm_iir = I915_READ(GEN6_PMIIR); >>> @@ -849,6 +859,8 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) >>> done: >>> I915_WRITE(DEIER, de_ier); >>> POSTING_READ(DEIER); >>> + I915_WRITE(SDEIER, sde_ier); >>> + POSTING_READ(SDEIER); >>> >>> return ret; >>> } >>> -- >>> 1.7.10.4 >>> >>> _______________________________________________ >>> Intel-gfx mailing list >>> Intel-gfx at lists.freedesktop.org >>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx >> >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > > > -- > Paulo Zanoni -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch