> Il giorno 02 mar 2017, alle ore 19:25, Jens Axboe <axboe@xxxxxx> ha scritto: > > On 03/02/2017 11:07 AM, Paolo Valente wrote: >> >>> Il giorno 02 mar 2017, alle ore 17:13, Jens Axboe <axboe@xxxxxx> ha scritto: >>> >>> On 03/02/2017 09:07 AM, Paolo Valente wrote: >>>> >>>>> Il giorno 02 mar 2017, alle ore 16:13, Jens Axboe <axboe@xxxxxx> ha scritto: >>>>> >>>>> On 03/02/2017 08:00 AM, Jens Axboe wrote: >>>>>> On 03/02/2017 03:28 AM, Paolo Valente wrote: >>>>>>> >>>>>>>> Il giorno 15 feb 2017, alle ore 19:04, Jens Axboe <axboe@xxxxxx> ha scritto: >>>>>>>> >>>>>>>> On 02/15/2017 10:58 AM, Jens Axboe wrote: >>>>>>>>> On 02/15/2017 10:24 AM, Paolo Valente wrote: >>>>>>>>>> >>>>>>>>>>> Il giorno 10 feb 2017, alle ore 19:32, Omar Sandoval <osandov@xxxxxxxxxxx> ha scritto: >>>>>>>>>>> >>>>>>>>>>> From: Omar Sandoval <osandov@xxxxxx> >>>>>>>>>>> >>>>>>>>>>> None of the other blk-mq elevator hooks are called with this lock held. >>>>>>>>>>> Additionally, it can lead to circular locking dependencies between >>>>>>>>>>> queue_lock and the private scheduler lock. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Omar, >>>>>>>>>> I'm sorry but it seems that a new potential deadlock has showed up. >>>>>>>>>> See lockdep splat below. >>>>>>>>>> >>>>>>>>>> I've tried to think about different solutions than turning back to >>>>>>>>>> deferring the body of exit_icq, but at no avail. >>>>>>>>> >>>>>>>>> Looks like a interaction between bfqd->lock and q->queue_lock. Since the >>>>>>>>> core has no notion of you bfqd->lock, the naturally dependency here >>>>>>>>> would be to nest bfqd->lock inside q->queue_lock. Is that possible for >>>>>>>>> you? >>>>>>>>> >>>>>>>>> Looking at the code a bit, maybe it'd just be simpler to get rid of >>>>>>>>> holding the queue lock for that spot. For the mq scheduler, we really >>>>>>>>> don't want places where we invoke with that held already. Does the below >>>>>>>>> work for you? >>>>>>>> >>>>>>>> Would need to remove one more lockdep assert. And only test this for >>>>>>>> the mq parts, we'd need to spread a bit of love on the classic >>>>>>>> scheduling icq exit path for this to work on that side. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> Jens, >>>>>>> here is the reply I anticipated in my previous email: after rebasing >>>>>>> against master, I'm getting again the deadlock that this patch of >>>>>>> yours solved (together with some changes in bfq-mq). I thought you added a >>>>>>> sort of equivalent commit (now) to the mainline branch. Am I wrong? >>>>>> >>>>>> The patch I posted was never pulled to completion, it wasn't clear >>>>>> to me if it fixed your issue or not. Maybe I missed a reply on >>>>>> that? >>>>>> >>>>>> Let me take another stab at it today, I'll send you a version to test >>>>>> on top of my for-linus branch. >>>>> >>>>> I took at look at my last posting, and I think it was only missing a >>>>> lock grab in cfq, since we don't hold that for icq exit anymore. Paolo, >>>>> it'd be great if you could verify that this works. Not for bfq-mq (we >>>>> already know it does), but for CFQ. >>>> >>>> No luck, sorry :( >>>> >>>> It looks like to have just not to take q->queue_lock in cfq_exit_icq. >>> >>> I was worried about that. How about the below? We need to grab the lock >>> at some point for legacy scheduling, but the ordering should be correct. >>> >>> >> >> It seems to be: no more crashes or lockdep complains after several tests, boots and shutdowns :) > > Great! Can I add your Tested-by? Sure! Thanks, Paolo > > -- > Jens Axboe