Re: [RESEND PATCH] blk-mq: fix hang caused by freeze/unfreeze sequence

On 4/9/19 5:29 PM, Jinpu Wang wrote:
> Bob Liu <bob.liu@xxxxxxxxxx> wrote on Tue, Apr 9, 2019 at 11:11 AM:
>>
>> This patch was proposed by Roman Pen[3] years ago.
>> Recently we hit a bug that is likely caused by the same issue, so I rebased
>> his fix onto v5.1 and am resending it.
>> The description below is copied almost verbatim from that patch[3].
>>
>> ------
>> A long time ago a similar fix was proposed by Akinobu Mita[1],
>> but it seems that at the time everyone decided to fix this subtle race in
>> percpu-refcount instead, and Tejun Heo[2] made an attempt (as far as I can
>> see, that patchset was not applied).
>>
>> The following is a description of a hang in blk_mq_freeze_queue_wait():
>> the same fix, but the bug seen from another angle.
>>
>> The hang happens on an attempt to freeze a queue while another task
>> unfreezes the same queue.
>>
>> The root cause is that the sequence of percpu_ref_reinit() and
>> percpu_ref_kill() is not enforced, and as a result those two can be swapped:
>>
>>  CPU#0               CPU#1
>>  ----------------    -----------------
>>  percpu_ref_kill()
>>
>>                      percpu_ref_kill() << atomic reference does
>>  percpu_ref_reinit()                   << not guarantee the order
>>
>>                      blk_mq_freeze_queue_wait() << HANG HERE
>>
>>                      percpu_ref_reinit()
>>
>> First, this wrong sequence raises two kernel warnings:
>>
>>   1st. WARNING at lib/percpu-refcount.c:309
>>        percpu_ref_kill_and_confirm called more than once
>>
>>   2nd. WARNING at lib/percpu-refcount.c:331
>>
>> But the most unpleasant effect is a hang in blk_mq_freeze_queue_wait(),
>> which waits for q_usage_counter to reach zero. That never happens,
>> because the percpu-ref was reinited (instead of being killed) and stays in
>> PERCPU state forever.
>>
>> The simplified sequence above can be reproduced with shared tags, when
>> queue A (q1) is being torn down while another queue B (q2) is being
>> initialized and tries to freeze queue A, which shares the same tag set:
>>
>>  CPU#0                           CPU#1
>>  ------------------------------- ------------------------------------
>>  q1 = blk_mq_init_queue(shared_tags)
>>
>>                                 q2 = blk_mq_init_queue(shared_tags):
>>                                   blk_mq_add_queue_tag_set(shared_tags):
>>                                     blk_mq_update_tag_set_depth(shared_tags):
>>                                       blk_mq_freeze_queue(q1)
>>  blk_cleanup_queue(q1)                 ...
>>    blk_mq_freeze_queue(q1)   <<<->>>   blk_mq_unfreeze_queue(q1)
>>
>> [1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@xxxxxxxxx
>> [2] Message id: 1443563240-29306-6-git-send-email-tj@xxxxxxxxxx
>> [3] https://patchwork.kernel.org/patch/9268199/
>>
>> Signed-off-by: Roman Pen <roman.penyaev@xxxxxxxxxxxxxxxx>
>> Signed-off-by: Bob Liu <bob.liu@xxxxxxxxxx>
>> Cc: Akinobu Mita <akinobu.mita@xxxxxxxxx>
>> Cc: Tejun Heo <tj@xxxxxxxxxx>
>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>> Cc: Christoph Hellwig <hch@xxxxxx>
>> Cc: linux-block@xxxxxxxxxxxxxxx
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>
> 
> Replaced Roman's email address.
> 
> We at 1 & 1 IONOS (formerly ProfitBricks) have been carrying this patch
> for some years,
> it has been running in production for some years too,

Nice to hear that!

> would be good to see it in upstream :)

Yes.
Could anyone review it? Thanks!

> 
> Thanks,
> 
> Jack Wang
> Linux Kernel Developer @ 1 & 1 IONOS
> 
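
For readers following the race description quoted above, the interleaving can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_queue`, every `toy_*` helper, and both `replay_*` functions are hypothetical names, the percpu-ref is reduced to a single dead/live flag, and the freeze depth is a plain integer. The "serialized" replay assumes that the counter update and the ref transition are made one indivisible step (for example under a lock); the patch body is not quoted in this message, so that is an assumption about the general shape of a fix rather than a description of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue (hypothetical, not the kernel struct). */
struct toy_queue {
    int  freeze_depth;  /* models the queue's freeze depth counter */
    bool ref_dead;      /* true after "kill", false after "reinit" */
    int  double_kills;  /* counts kills issued while already dead */
};

static void toy_ref_kill(struct toy_queue *q)
{
    if (q->ref_dead)
        q->double_kills++;  /* models the "called more than once" warning */
    q->ref_dead = true;
}

static void toy_ref_reinit(struct toy_queue *q)
{
    q->ref_dead = false;    /* ref goes back to its live state */
}

/* Counter halves of freeze/unfreeze, split out so a replay can interleave
 * them with the ref halves. Each returns whether the caller must touch the ref. */
static bool toy_freeze_count(struct toy_queue *q)
{
    return ++q->freeze_depth == 1;
}

static bool toy_unfreeze_count(struct toy_queue *q)
{
    q->freeze_depth--;
    assert(q->freeze_depth >= 0);  /* an unbalanced unfreeze is a caller bug */
    return q->freeze_depth == 0;
}

/* With no requests in flight, a freeze waiter can finish only if the ref is dead. */
static bool toy_freeze_wait_can_finish(const struct toy_queue *q)
{
    return q->ref_dead;
}

/* Replays the bad interleaving: CPU#1's kill lands before CPU#0's reinit. */
struct toy_queue replay_racy(void)
{
    struct toy_queue q = {0};
    bool do_reinit, do_kill;

    if (toy_freeze_count(&q))            /* CPU#0 freeze: depth 0 -> 1 */
        toy_ref_kill(&q);

    do_reinit = toy_unfreeze_count(&q);  /* CPU#0 unfreeze: depth 1 -> 0 */
    do_kill = toy_freeze_count(&q);      /* CPU#1 freeze:   depth 0 -> 1 */
    if (do_kill)
        toy_ref_kill(&q);                /* CPU#1: kill on an already dead ref */
    if (do_reinit)
        toy_ref_reinit(&q);              /* CPU#0: late reinit undoes the kill */
    return q;
}

/* Same operations, but each counter update and its ref transition stay together. */
struct toy_queue replay_serialized(void)
{
    struct toy_queue q = {0};

    if (toy_freeze_count(&q))            /* CPU#0 freeze */
        toy_ref_kill(&q);
    if (toy_unfreeze_count(&q))          /* CPU#0 unfreeze, as one step */
        toy_ref_reinit(&q);
    if (toy_freeze_count(&q))            /* CPU#1 freeze, as one step */
        toy_ref_kill(&q);
    return q;
}
```

In the racy replay the queue ends with a freeze depth of 1 but a live ref, so `toy_freeze_wait_can_finish()` reports false, which corresponds to the "HANG HERE" point in the diagram. In the serialized replay the depth is also 1, but the ref is dead and the waiter can make progress.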



