> Il giorno 07 feb 2017, alle ore 22:45, Omar Sandoval <osandov@xxxxxxxxxxx> ha scritto: > > On Tue, Feb 07, 2017 at 06:33:46PM +0100, Paolo Valente wrote: >> Hi, >> this patch is meant to show that, if the body of the hook exit_icq is executed >> from inside that hook, and not as deferred work, then a circular deadlock >> occurs. >> >> It happens if, on a CPU >> - the body of icq_exit takes the scheduler lock, >> - it does so from inside the exit_icq hook, which is invoked with the queue >> lock held >> >> while, on another CPU >> - bfq_bio_merge, after taking the scheduler lock, invokes bfq_bic_lookup, >> which, in its turn, takes the queue lock. bfq_bic_lookup needs to take such a >> lock, because it invokes ioc_lookup_icq. >> >> For more details, here is a lockdep report, right before the deadlock did occur. >> >> [ 44.059877] ====================================================== >> [ 44.124922] [ INFO: possible circular locking dependency detected ] >> [ 44.125795] 4.10.0-rc5-bfq-mq+ #38 Not tainted >> [ 44.126414] ------------------------------------------------------- >> [ 44.127291] sync/2043 is trying to acquire lock: >> [ 44.128918] (&(&bfqd->lock)->rlock){-.-...}, at: [<ffffffff90484195>] bfq_exit_icq_bfqq+0x55/0x140 >> [ 44.134052] >> [ 44.134052] but task is already holding lock: >> [ 44.134868] (&(&q->__queue_lock)->rlock){-.....}, at: [<ffffffff9044738e>] put_io_context_active+0x6e/0xc0 > > Hey, Paolo, > > I only briefly skimmed the code, but what are you using the queue_lock > for? You should just use your scheduler lock everywhere. blk-mq doesn't > use the queue lock, so the scheduler is the only thing you need mutual > exclusion against. Hi Omar, the cause of the problem is that the hook functions bfq_request_merge and bfq_allow_bio_merge invoke, directly or through other functions, the function bfq_bic_lookup, which, in its turn, invokes ioc_lookup_icq. The latter must be invoked with the queue lock held. In particular the offending lines in bfq_bic_lookup are: spin_lock_irqsave(q->queue_lock, flags); icq = icq_to_bic(ioc_lookup_icq(ioc, q)); spin_unlock_irqrestore(q->queue_lock, flags); Maybe I'm missing something and we can avoid taking this lock? Thanks, Paolo > I'm guessing if you stopped using that, your locking > issues would go away.