Hi, Thank you for your posting the patch, however I have some nitpickings (and some requests to you). On Thu, Jul 18, 2024 at 11:56:54AM +0000, Edmund Raile wrote: > Commit b5b519965c4c ("ALSA: firewire-lib: operate for period elapse event > in process context") removed the process context workqueue from > amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove > its overhead. > > With RME Fireface 800, this lead to a regression since Kernels 5.14.0, > causing a deadlock with eventual system freeze under ALSA operation: > > ? tasklet_unlock_spin_wait > </NMI> > <TASK> > ohci_flush_iso_completions firewire_ohci > amdtp_domain_stream_pcm_pointer snd_firewire_lib > snd_pcm_update_hw_ptr0 snd_pcm > snd_pcm_status64 snd_pcm > > ? native_queued_spin_lock_slowpath > </NMI> > <IRQ> > _raw_spin_lock_irqsave > snd_pcm_period_elapsed snd_pcm > process_rx_packets snd_firewire_lib > irq_target_callback snd_firewire_lib > handle_it_packet firewire_ohci > context_tasklet firewire_ohci > > Restore the work queue to prevent deadlock between ALSA substream > process context spin_lock of snd_pcm_stream_lock_irq() in snd_pcm_status64() > and OHCI 1394 IT softIRQ context spin_lock of snd_pcm_stream_lock_irqsave() > in snd_pcm_period_elapsed(). > > to reproduce the issue: > direct ALSA playback to the device: > mpv --audio-device=alsa/sysdefault:CARD=Fireface800 Spor-Ignition.flac > Time to occurrence: 2s to 30m > Likelihood increased by: > - high CPU load > stress --cpu $(nproc) > - switching between applications via workspaces > tested with i915 in Xfce > PulsaAudio / PipeWire conceal the issue as they run PCM substream > without period wakeup mode, issuing less hardIRQs. > > Closes: https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/T/#u > Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context") > Signed-off-by: Edmund Raile <edmund.raile@xxxxxxxxx> > --- > This is the follow-up patch to the 5.14.0 regression I reported: > https://lore.kernel.org/regressions/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simnl@jn3eo7pe642q/T/#u > ("[REGRESSION] ALSA: firewire-lib: snd_pcm_period_elapsed deadlock with Fireface 800") > > Takashi Sakamoto explained the issue in his response to the regression: > A. In the process context > * (lock A) Acquiring spin_lock by snd_pcm_stream_lock_irq() in > snd_pcm_status64() > * (lock B) Then attempt to enter tasklet > > B. In the softIRQ context > * (lock B) Enter tasklet > * (lock A) Attempt to acquire spin_lock by snd_pcm_stream_lock_irqsave() in > snd_pcm_period_elapsed() > > This leads me to believe this isn't just an issue limited to the RME Fireface We are in the merge window to Linux kernel 6.11[1]. I prefer reviewing a small and trivial patches in the weeks, thus I would like to postpone applying this kind of patches after releasing -rc1 even if they look good. In Linux kernel development, we have the process to apply patches into released kernels to fix bugs and regressions. As of today, the developers pay some of their efforts to maintain the following kernels[2]: * 6.10 * 6.9.10 * 6.6.41 * 6.1.100 * 5.15.163 * 5.10.222 * 5.4.280 * 4.19.318 It is cleared that the issued changes were applied to v5.14 kernel, thus we should lead the developers to find the posted patches and apply them to future releases of each version. The process[3] have come into existence enough before the regression report procedure[4]. I would like you to read some documentation about the process and add more care for stable kernel maintainers. Well, the issued commits are (older at first): * 7ba5ca32fe6 * b5b519965c4 As long as I can see, these commits can be reverted per each, with a slight handy-changes. In the case, it is preferable to make a patchset including these two revert patches. I would like to request it to you, instead of the all-in-one patch, so that developers easily find the issued commits (and work to apply these patches into kernels maintained publicly/locally). At last, I prefer that the whole patch comment is written by the posters, instead of referring to comments by the others. I know that the description about AB/BA deadlock is a bit hard to write, but it is enough and satisfied for you to write what you understand about the issue. I'd accept it. [1] https://lore.kernel.org/lkml/CAHk-=wjV_O2g_K19McjGKrxFxMFDqex+fyGcKc3uac1ft_O2gg@xxxxxxxxxxxxxx/ [2] https://kernel.org/ [3] https://docs.kernel.org/process/stable-kernel-rules.html [4] https://docs.kernel.org/process/handling-regressions.html Thanks Takashi Sakamoto