Hi,
在 2024/08/15 4:22, Jens Axboe 写道:
On 8/14/24 1:52 PM, Bart Van Assche wrote:
On 8/14/24 4:35 AM, Li Lingfeng wrote:
When interrupt is turned on while a lock holding by spin_lock_irq it
throws a warning because of potential deadlock.
Which tool reported the warning? Please mention this in the patch
description.
blk_mq_mark_tag_wait
spin_lock_irq(&wq->lock)
--> turn off interrupt and get lockA
blk_mq_get_driver_tag
__blk_mq_tag_busy
spin_lock_irq(&tags->lock)
spin_unlock_irq(&tags->lock)
--> release lockB and turn on interrupt accidentally
This looks correct, however, many details are hidden:
t1: IO dispatch
blk_mq_prep_dispatch_rq
blk_mq_get_driver_tag
__blk_mq_get_driver_tag
__blk_mq_alloc_driver_tag
blk_mq_tag_busy -> tag is already busy
// failed to get driver tag
blk_mq_mark_tag_wait
spin_lock_irq(&wq->lock) -> lock A
__add_wait_queue(wq, wait) -> wait queue active
blk_mq_get_driver_tag
__blk_mq_tag_busy
-> 1) tag must be idle, which means there can't be inflight IO
spin_lock_irq(&tags->lock) -> lock B
spin_unlock_irq(&tags->lock) -> unlock B
-> 2) context must be preempt by IO interrupt to trigger deadlock.
So, the deadlock is not possible in theory, there can't be inflight IO
if __blk_mq_tag_busy what to hold the second lock, while deadlock
require IO to be done.
Any way, the change looks good to me.
Thanks,
Kuai
The above call chain does not match the code in Linus' master tree.
Please fix this.
Fix it by using spin_lock_irqsave to get lockB instead of spin_lock_irq.
Fixes: 4f1731df60f9 ("blk-mq: fix potential io hang by wrong 'wake_batch'")
Signed-off-by: Li Lingfeng <lilingfeng3@xxxxxxxxxx>
Please leave a blank line between the patch description and the section
with tags.
Please just include the actual lockdep trace rather than a doctored up
one, it's a lot more descriptive. And use the real lock names rather
than turn it into hypotheticals.