There's an entire pile of issues in here: - Use the main RING_HEAD register, not ACTHD. ACTHD points at the gtt offset of the batch buffer when a batch is executed. Semaphores are always emitted to the main ring, so we always want to look at that. - Mask the obtained HEAD pointer with the actual ring size, which is much smaller. Together with the above issue this resulted us in trying to dereference a pointer way outside of the ring mmio mapping. The resulting invalid access in interrupt context (hangcheck is executed from timers) lead to a full blown kernel panic. The fbcon panic handler then tried to frob our driver harder, resulting in a full machine hang at least on my snb here where I've stumbled over this. - Handle ring wrapping correctly and be a bit more explicit about how many dwords we're scanning. We probably should also scan more than just 4 ... - Space out some of teh computations for readability. This prevents hard-hangs on my snb here. Verdict from QA is still pending, but the symptoms match. Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> Cc: Ben Widawsky <ben@xxxxxxxxxxxx> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74100 Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx> --- drivers/gpu/drm/i915/i915_irq.c | 38 ++++++++++++++++++++++++++------------ 1 file changed, 26 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index be2713f12e08..473372a6c97d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2529,29 +2529,43 @@ static struct intel_ring_buffer * semaphore_waits_for(struct intel_ring_buffer *ring, u32 *seqno) { struct drm_i915_private *dev_priv = ring->dev->dev_private; - u32 cmd, ipehr, acthd, acthd_min; + u32 cmd, ipehr, head; + int i; ipehr = I915_READ(RING_IPEHR(ring->mmio_base)); if ((ipehr & ~(0x3 << 16)) != (MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE | MI_SEMAPHORE_REGISTER)) return NULL; - /* ACTHD is likely pointing to the dword after the actual command, - * so scan backwards until we find the MBOX. + /* + * HEAD is likely pointing to the dword after the actual command, + * so scan backwards until we find the MBOX. But limit it to just 3 + * dwords. Note that we don't care about ACTHD here since that might + * point at at batch, and semaphores are always emitted into the + * ringbuffer itself. */ - acthd = intel_ring_get_active_head(ring) & HEAD_ADDR; - acthd_min = max((int)acthd - 3 * 4, 0); - do { - cmd = ioread32(ring->virtual_start + acthd); + head = I915_READ_HEAD(ring) & HEAD_ADDR; + + for (i = 4; i; --i) { + /* + * Be paranoid and presume the hw has gone off into the wild - + * our ring is smaller than what the hardware (and hence + * HEAD_ADDR) allows. Also handles wrap-around. + */ + head &= ring->size - 1; + + /* This here seems to blow up */ + cmd = ioread32(ring->virtual_start + head); if (cmd == ipehr) break; - acthd -= 4; - if (acthd < acthd_min) - return NULL; - } while (1); + head -= 4; + } + + if (!i) + return NULL; - *seqno = ioread32(ring->virtual_start+acthd+4)+1; + *seqno = ioread32(ring->virtual_start + head + 4) + 1; return &dev_priv->ring[(ring->id + (((ipehr >> 17) & 1) + 1)) % 3]; } -- 1.8.1.4 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx