On Thu, 2023-06-08 at 13:54 -0500, Mike Christie wrote:
> On 6/8/23 9:12 AM, Bart Van Assche wrote:
> > On 6/7/23 22:44, Christoph Hellwig wrote:
> > > > > Thanks. This wasn't obvious to me from the current code. I'll
> > > > > add a comment in the next version.
> > > >
> > > > The crucial question is now, is it sufficient to call
> > > > blk_mq_quiesce_queue_nowait() under the mutex, or does the call
> > > > to blk_mq_wait_quiesce_done() have to be under the mutex, too?
> > > > The latter would actually kill off our attempt to fix the delay
> > > > in fc_remote_port_delete() that was caused by repeated
> > > > synchronize_rcu() calls.
> > > >
> > > > But if I understand you correctly, moving the wait out of the
> > > > mutex should be ok. I'll update the series accordingly.
> > >
> > > I can't think of a reason we'd want to lock over the wait, but
> > > Bart knows this code way better than I do.
> >
> > Unless deep changes would be made in the block layer
> > blk_mq_quiesce_queue_nowait() and/or blk_mq_wait_quiesce_done()
> > functions, moving blk_mq_wait_quiesce_done() outside the critical
> > section seems fine to me.
>
> If it helps, I tested with iscsi and the mutex changes discussed
> above and it worked ok for me.

I guess the race that Bart was hinting at is hard to trigger.

I would like to remark that the fact that we need to hold the SCSI
state_mutex while calling blk_mq_quiesce_queue_nowait() looks like a
layering issue to me. Not sure if, and how, this could be avoided,
though.

Regards
Martin