Re: [PATCH] blk-mq-sched: don't hold queue_lock when calling exit_icq

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 03/02/2017 11:07 AM, Paolo Valente wrote:
> 
>> Il giorno 02 mar 2017, alle ore 17:13, Jens Axboe <axboe@xxxxxx> ha scritto:
>>
>> On 03/02/2017 09:07 AM, Paolo Valente wrote:
>>>
>>>> Il giorno 02 mar 2017, alle ore 16:13, Jens Axboe <axboe@xxxxxx> ha scritto:
>>>>
>>>> On 03/02/2017 08:00 AM, Jens Axboe wrote:
>>>>> On 03/02/2017 03:28 AM, Paolo Valente wrote:
>>>>>>
>>>>>>> Il giorno 15 feb 2017, alle ore 19:04, Jens Axboe <axboe@xxxxxx> ha scritto:
>>>>>>>
>>>>>>> On 02/15/2017 10:58 AM, Jens Axboe wrote:
>>>>>>>> On 02/15/2017 10:24 AM, Paolo Valente wrote:
>>>>>>>>>
>>>>>>>>>> Il giorno 10 feb 2017, alle ore 19:32, Omar Sandoval <osandov@xxxxxxxxxxx> ha scritto:
>>>>>>>>>>
>>>>>>>>>> From: Omar Sandoval <osandov@xxxxxx>
>>>>>>>>>>
>>>>>>>>>> None of the other blk-mq elevator hooks are called with this lock held.
>>>>>>>>>> Additionally, it can lead to circular locking dependencies between
>>>>>>>>>> queue_lock and the private scheduler lock.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Omar,
>>>>>>>>> I'm sorry but it seems that a new potential deadlock has showed up.
>>>>>>>>> See lockdep splat below.
>>>>>>>>>
>>>>>>>>> I've tried to think about different solutions than turning back to
>>>>>>>>> deferring the body of exit_icq, but at no avail.
>>>>>>>>
>>>>>>>> Looks like a interaction between bfqd->lock and q->queue_lock. Since the
>>>>>>>> core has no notion of you bfqd->lock, the naturally dependency here
>>>>>>>> would be to nest bfqd->lock inside q->queue_lock. Is that possible for
>>>>>>>> you?
>>>>>>>>
>>>>>>>> Looking at the code a bit, maybe it'd just be simpler to get rid of
>>>>>>>> holding the queue lock for that spot. For the mq scheduler, we really
>>>>>>>> don't want places where we invoke with that held already. Does the below
>>>>>>>> work for you?
>>>>>>>
>>>>>>> Would need to remove one more lockdep assert. And only test this for
>>>>>>> the mq parts, we'd need to spread a bit of love on the classic
>>>>>>> scheduling icq exit path for this to work on that side.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Jens,
>>>>>> here is the reply I anticipated in my previous email: after rebasing
>>>>>> against master, I'm getting again the deadlock that this patch of
>>>>>> yours solved (together with some changes in bfq-mq).  I thought you added a
>>>>>> sort of equivalent commit (now) to the mainline branch.  Am I wrong?
>>>>>
>>>>> The patch I posted was never pulled to completion, it wasn't clear
>>>>> to me if it fixed your issue or not. Maybe I missed a reply on
>>>>> that?
>>>>>
>>>>> Let me take another stab at it today, I'll send you a version to test
>>>>> on top of my for-linus branch.
>>>>
>>>> I took at look at my last posting, and I think it was only missing a
>>>> lock grab in cfq, since we don't hold that for icq exit anymore.  Paolo,
>>>> it'd be great if you could verify that this works. Not for bfq-mq (we
>>>> already know it does), but for CFQ.
>>>
>>> No luck, sorry :(
>>>
>>> It looks like to have just not to take q->queue_lock in cfq_exit_icq.
>>
>> I was worried about that. How about the below? We need to grab the lock
>> at some point for legacy scheduling, but the ordering should be correct.
>>
>>
> 
> It seems to be: no more crashes or lockdep complains after several tests, boots and shutdowns :)

Great! Can I add your Tested-by?

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux