On Tue, Oct 22, 2024 at 08:18:05AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 18, 2024 at 09:35:42AM +0800, Ming Lei wrote:
> > Recently we got several deadlock reports[1][2][3] caused by blk_mq_freeze_queue
> > and blk_enter_queue().
> >
> > Turns out the two are just like one rwsem, so model them as rwsem for
> > supporting lockdep:
> >
> > 1) model blk_mq_freeze_queue() as down_write_trylock()
> >    - it is exclusive lock, so dependency with blk_enter_queue() is covered
> >    - it is trylock because blk_mq_freeze_queue() are allowed to run concurrently
>
> Is this using the right terminology? down_write and other locking
> primitives obviously can run concurrently, the whole point is to
> synchronize the code run inside the critical section.
>
> I think what you mean here is blk_mq_freeze_queue can be called more
> than once due to a global recursion counter.
>
> Not sure modelling it as a trylock is the right approach here,
> I've added the lockdep maintainers if they have an idea.

Yeah, looks like we can just call lock_acquire for the outermost
freeze/unfreeze.

> >
> > 2) model blk_enter_queue() as down_read()
> >    - it is shared lock, so concurrent blk_enter_queue() are allowed
> >    - it is read lock, so dependency with blk_mq_freeze_queue() is modeled
> >    - blk_queue_exit() is often called from other contexts(such as irq), and
> >      it can't be annotated as rwsem_release(), so simply do it in
> >      blk_enter_queue(), this way still covered cases as many as possible
> >
> > NVMe is the only subsystem which may call blk_mq_freeze_queue() and
> > blk_mq_unfreeze_queue() from different context, so it is the only
> > exception for the modeling. Add one tagset flag to exclude it from
> > the lockdep support.
>
> rwsems have a non_owner variant for these kinds of use cases,
> we should do the same for blk_mq_freeze_queue to annotate the callsite
> instead of a global flag.

Here it isn't a real rwsem, and lockdep doesn't have a non_owner variant
for rwsem_acquire() and rwsem_release().

Another corner case is blk_mark_disk_dead(), in which freeze & unfreeze
may be run from different task contexts too.

thanks,
Ming
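
To make the "outermost freeze/unfreeze" idea above concrete, here is a rough userspace sketch, not the actual block layer change. Every name in it (fake_queue, fake_freeze_queue(), fake_lock_acquire() and so on) is hypothetical, and the two hook functions only count calls instead of calling into lockdep. It assumes the freeze recursion counter is already serialized by the caller.

```c
#include <assert.h>

/* Hypothetical stand-ins for the lockdep hooks: they only count calls. */
static int acquire_calls;
static int release_calls;

static void fake_lock_acquire(void) { acquire_calls++; }
static void fake_lock_release(void) { release_calls++; }

/* Hypothetical queue model: nothing but the freeze recursion counter. */
struct fake_queue {
	int freeze_depth;
};

static void fake_freeze_queue(struct fake_queue *q)
{
	/* Annotate only the 0 -> 1 transition, i.e. the outermost freeze. */
	if (q->freeze_depth++ == 0)
		fake_lock_acquire();
}

static void fake_unfreeze_queue(struct fake_queue *q)
{
	/* An unfreeze without a matching freeze is a caller bug. */
	assert(q->freeze_depth > 0);

	/* Annotate only the 1 -> 0 transition, i.e. the outermost unfreeze. */
	if (--q->freeze_depth == 0)
		fake_lock_release();
}
```

With this shape, nested freeze/unfreeze pairs produce a single acquire/release pair, so the model no longer needs trylock semantics to tolerate repeated freezes. It does not address the case where freeze and unfreeze run in different task contexts.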