Re: [PATCH 1/3] drm/i915/execlists: Remove synchronize_hardirq()

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 4 Jun 2018 16:13:54 +0100

On 04/06/2018 08:34, Chris Wilson wrote:
This looked to be a sensible precaution to flush an ongoing interrupt
before resetting the IRQ. However, it has a potential to trigger a lockup
and is not strictly required so drop it.

[21411.603173] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[21411.603174] Modules linked in: i915 intel_gtt drm_kms_helper
[21411.603181] CPU: 1 PID: 672 Comm: kworker/1:2 Tainted: G     U            4.17.0-rc7+ #307
[21411.603181] Hardware name:  /NUC6CAYB, BIOS AYAPLCEL.86A.0029.2016.1124.1625 11/24/2016
[21411.603227] Workqueue: events_long i915_hangcheck_elapsed [i915]
[21411.603234] RIP: 0010:__synchronize_hardirq+0x12/0x50
[21411.603235] RSP: 0018:ffffc90000393c48 EFLAGS: 00000006
[21411.603236] RAX: 0000000001440200 RBX: ffff880275b09200 RCX: 000000000000003d
[21411.603237] RDX: 0000000000000000 RSI: 000000000000007d RDI: ffff880275b09200
[21411.603238] RBP: ffff880256e81440 R08: ffff88027647a910 R09: ffff880275b09200
[21411.603239] R10: 0000000000000000 R11: 0000000000000040 R12: ffff880275b092a4
[21411.603239] R13: 0000000000000286 R14: ffff880273a86000 R15: ffff88027004eac0
[21411.603241] FS:  0000000000000000(0000) GS:ffff88027fc80000(0000) knlGS:0000000000000000
[21411.603242] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[21411.603243] CR2: 00007f9cc24d6018 CR3: 0000000002a09000 CR4: 00000000001406e0
[21411.603243] Call Trace:
[21411.603250]  synchronize_hardirq+0x1b/0x30
[21411.603286]  reset_irq+0x1f/0x180 [i915]
[21411.603323]  execlists_reset+0x2f/0x1d0 [i915]
[21411.603351]  i915_reset_engine+0xf0/0x140 [i915]
[21411.603381]  i915_handle_error+0x11d/0x420 [i915]
[21411.603386]  ? scnprintf+0x3a/0x70
[21411.603421]  i915_hangcheck_elapsed+0x310/0x490 [i915]
[21411.603425]  process_one_work+0x195/0x310
[21411.603427]  worker_thread+0x26/0x3c0
[21411.603429]  ? wq_nice_store+0xd0/0xd0
[21411.603431]  kthread+0x107/0x120
[21411.603433]  ? _kthread_create_worker_on_cpu+0x40/0x40
[21411.603435]  ret_from_fork+0x1f/0x30
[21411.603437] Code: 00 00 89 c5 e8 50 e1 41 00 48 8b 53 38 89 e8 81 22 ff ff fb ff 5b 5d c3 90 41 54 4c 8d a7 a4 00 00 00 55 53 48 89 fb eb 02 f3 90 <48> 8b 43 38 8b 00 a9 00 00 04 00 75 f1 4c 89 e7 e8 89 e0 41 00

Fixes: 46b3617dfec8 ("drm/i915: Actually flush interrupts on reset not just wedging")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
---
  drivers/gpu/drm/i915/intel_lrc.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 517e92c6a70b..8d912d0c8fc1 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -869,7 +869,6 @@ static void reset_irq(struct intel_engine_cs *engine)
  {
  	/* Mark all CS interrupts as complete */
  	smp_store_mb(engine->execlists.active, 0);
-	synchronize_hardirq(engine->i915->drm.irq);
  
  	clear_gtiir(engine);
  


Funny, I was just commenting on synchronize_hardirq in the direct 
submission series.

In this one I don't immediately see what deadlocks. What's on the other 
side, I mean what is preventing interrupt handler from exiting?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx