On 24.4.2023. 19:27, Johannes Berg wrote:
On Sun, 2023-04-23 at 10:24 +0200, Mirsad Goran Todorovac wrote:
In the function ieee80211_tx_dequeue() there is a locking sequence:
begin:
spin_lock(&local->queue_stop_reason_lock);
q_stopped = local->queue_stop_reasons[q];
spin_unlock(&local->queue_stop_reason_lock);
However small the chance (increased by ftracetest), an asynchronous
interrupt can occur between spin_lock() and spin_unlock(),
and the interrupt routine will then attempt to take the same
&local->queue_stop_reason_lock again.
This is the only remaining spin_lock() on local->queue_stop_reason_lock
that does not disable interrupts, and it could therefore cause a deadlock
on the same CPU (core).
This will cause a costly reset of the CPU and the wifi device, or a
complete hang in the single-CPU, single-core scenario.
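
To illustrate the scenario described above, here is a small userspace model
in C. It is only a sketch: struct model_cpu and every model_* helper are
hypothetical names invented for this illustration, not kernel APIs. It
assumes the fix is to use the interrupt-disabling lock variant
(spin_lock_irqsave() and spin_unlock_irqrestore() in kernel terms), so that
an interrupt is held off until the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: nothing here is a kernel type or function. */
struct model_cpu {
	bool irqs_enabled;	/* models the local CPU interrupt flag */
	bool lock_held;		/* models local->queue_stop_reason_lock */
	bool irq_pending;	/* an interrupt was raised while masked */
	bool deadlocked;	/* handler found the lock already held */
};

/* Interrupt handler that takes the same lock as the interrupted code. */
static void model_irq_handler(struct model_cpu *cpu)
{
	if (cpu->lock_held) {
		/* A real handler would spin forever on this CPU. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;	/* lock */
	cpu->lock_held = false;	/* unlock */
}

/* Raise an interrupt: run it now if enabled, else leave it pending. */
static void model_raise_irq(struct model_cpu *cpu)
{
	if (cpu->irqs_enabled)
		model_irq_handler(cpu);
	else
		cpu->irq_pending = true;
}

/* Restore the saved interrupt state and deliver a pending interrupt. */
static void model_irq_restore(struct model_cpu *cpu, bool was_enabled)
{
	cpu->irqs_enabled = was_enabled;
	if (cpu->irqs_enabled && cpu->irq_pending) {
		cpu->irq_pending = false;
		model_irq_handler(cpu);
	}
}

/* Run the critical section once; return true if it deadlocked. */
static bool model_dequeue(bool use_irqsave)
{
	struct model_cpu cpu = { .irqs_enabled = true };
	bool flags = cpu.irqs_enabled;	/* saved interrupt state */

	if (use_irqsave)
		cpu.irqs_enabled = false;	/* models spin_lock_irqsave() */
	cpu.lock_held = true;

	model_raise_irq(&cpu);	/* interrupt arrives inside the section */

	cpu.lock_held = false;
	if (use_irqsave)
		model_irq_restore(&cpu, flags);	/* models spin_unlock_irqrestore() */

	return cpu.deadlocked;
}
```

In this model, model_dequeue(false) reports a deadlock because the handler
runs while the lock is held, while model_dequeue(true) does not, because the
interrupt is delivered only after the unlock.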
This is the probable reproduction of the deadlock:
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: Possible unsafe locking scenario:
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: CPU0
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: ----
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock);
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: <Interrupt>
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock);
Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel:
*** DEADLOCK ***
Fixes: 4444bc2116ae
That fixes tag is wrong, should be
Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption")
Otherwise seems fine to me, submit it properly?
johannes
Will do, Sir. Do I have an Acked-by: ?
Thank you.
Mirsad
--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union