Chris, Jeeja already has a patch that fix this issue in bug 100419, this also "happens" to be identical to a patch a sent on an internal mail to Intel audio developers on the 5 of April. But, then they replied they wanted another solution. I guess it is up to audio dev to decide what solution they want. /Marta > -----Original Message----- > From: Intel-gfx [mailto:intel-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx] On Behalf > Of Chris Wilson > Sent: Thursday, May 4, 2017 1:42 PM > To: Takashi Iwai <tiwai@xxxxxxx> > Cc: Koul, Vinod <vinod.koul@xxxxxxxxx>; Kp, Jeeja <jeeja.kp@xxxxxxxxx>; > alsa-devel@xxxxxxxxxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; # v4 . 7+ > <stable@xxxxxxxxxxxxxxx> > Subject: Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for > hdac_wait_for_cmd_dmas() timeout > > On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote: > > On Thu, 04 May 2017 12:18:29 +0200, > > Chris Wilson wrote: > > > > > > hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do > > > not wait forever for stuck hardware. However, it is called from an > > > irq-disabled context which prevents jiffie from advancing and so the > > > loop doesn't terminate if the hardware fails. This can then cause > > > NMI watchdog warnings, such as: > > > > > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 > > > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core > snd_pcm ptp mei_me prime_numbers pps_core mei lpc_ich i2c_hid > i2c_designware_platform i2c_designware_core [last unloaded: i915] > > > irq event stamp: 13366 > > > hardirqs last enabled at (13365): [<ffffffff81891a87>] > _raw_spin_unlock_irq+0x27/0x50 > > > hardirqs last disabled at (13366): [<ffffffff818918d2>] > _raw_spin_lock_irq+0x12/0x50 > > > softirqs last enabled at (12744): [<ffffffff81085c79>] > __do_softirq+0x1d9/0x4c0 > > > softirqs last disabled at (12721): [<ffffffff810860d9>] irq_exit+0xa9/0xc0 > > > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U 4.11.0-rc4- > CI-CI_DRM_319+ #1 > > > Hardware name: /NUC5i5RYB, BIOS > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > > > Workqueue: events_unbound async_run_entry_fn > > > task: ffff88024cd32740 task.stack: ffffc9000162c000 > > > RIP: 0010:preempt_count_add+0xe/0xc0 > > > RSP: 0018:ffffc9000162fbd8 EFLAGS: 00000082 > > > RAX: 0000000080000001 RBX: 0000000704b96558 RCX: > 0000000000000002 > > > RDX: 0000000000000000 RSI: ffffffff81c74f2d RDI: 0000000000000001 > > > RBP: ffffc9000162fc08 R08: 00000000bbcc90cc R09: 23c7b07100000000 > > > R10: ffffffff827901a8 R11: ffff88024cd32740 R12: 0000000704b92baa > > > R13: 0000000000003ea0 R14: 0000000000000003 R15: ffffffffa00061f0 > > > FS: 0000000000000000(0000) GS:ffff880256d80000(0000) > knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 00007f90f84a5144 CR3: 0000000003e0f000 CR4: 00000000003406e0 > > > Call Trace: > > > ? delay_tsc+0x3d/0xc0 > > > __delay+0xa/0x10 > > > __const_udelay+0x31/0x40 > > > snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] > > > ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] > > > snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] > > > azx_stop_chip+0x9/0x10 [snd_hda_codec] > > > azx_suspend+0x72/0x220 [snd_hda_intel] > > > pci_pm_suspend+0x71/0x140 > > > dpm_run_callback+0x6f/0x330 > > > ? pci_pm_freeze+0xe0/0xe0 > > > __device_suspend+0xf9/0x370 > > > ? dpm_watchdog_set+0x60/0x60 > > > async_suspend+0x1a/0x90 > > > async_run_entry_fn+0x34/0x160 > > > process_one_work+0x1f4/0x6d0 > > > ? process_one_work+0x16e/0x6d0 > > > worker_thread+0x49/0x4a0 > > > kthread+0x107/0x140 > > > ? process_one_work+0x6d0/0x6d0 > > > ? kthread_create_on_node+0x40/0x40 > > > ret_from_fork+0x2e/0x40 > > > > > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to > > > set") > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Cc: Jeeja KP <jeeja.kp@xxxxxxxxx> > > > Cc: Vinod Koul <vinod.koul@xxxxxxxxx> > > > Cc: Takashi Iwai <tiwai@xxxxxxx> > > > Cc: <stable@xxxxxxxxxxxxxxx> # v4.7+ > > > > Any reason to submit a different fix from what's attached in the > > bugzilla you mentioned? > > Because I didn't see it when Marta complained on irc and suggested > reverting 38b19ed7f81e. There's no advantage either way, but even after > fixing the timeout detection we are still left with the issue that the hw is > stuck and suffer a 200ms suspend delay. :| -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx