Re: [RFC] hard LOCKUP caused by race between blk_init_queue_node and blkcg_print_blkgs

Hi Bart,
Thanks very much for the quick response.

On 18/1/31 05:19, Bart Van Assche wrote:
> On Tue, 2018-01-30 at 19:21 +0800, Joseph Qi wrote:
>> Hi Jens and Folks,
>>
>> Recently we've hit a hard LOCKUP issue. After investigating it we
>> found a race between blk_init_queue_node and blkcg_print_blkgs.
>> The race is described below.
>>
>> blk_init_queue_node                 blkcg_print_blkgs
>>   blk_alloc_queue_node (1)
>>     q->queue_lock = &q->__queue_lock (2)
>>     blkcg_init_queue(q) (3)
>>                                     spin_lock_irq(blkg->q->queue_lock) (4)
>>   q->queue_lock = lock (5)
>>                                     spin_unlock_irq(blkg->q->queue_lock) (6)
>>
>> (1) allocate an uninitialized queue;
>> (2) initialize queue_lock to its default internal lock;
>> (3) initialize the blkcg part of the request queue, which creates a
>> blkg and inserts it into blkg_list;
>> (4) traverse blkg_list, find the created blkg, and take its queue
>> lock, which at this point is still the default *internal lock*;
>> (5) *race window*: queue_lock is now overridden with the *driver
>> specified lock*;
>> (6) unlock the *driver specified lock* rather than the *internal
>> lock* taken in (4), breaking lock/unlock balance.
>>
>> For the issue above, I think blkcg_init_queue is called too early. We
>> can't allow further use of the request queue before it is fully
>> initialized. Since blk_init_queue_node is a really common path and it
>> allows the driver to override the default internal lock, I'm afraid
>> several other places may have the same issue.
>> Am I missing something here?
> 
> What code triggered the blkcg_print_blkgs() call? Was it perhaps a function
> related to throttling? If so, I think there are two possible ways to fix this
> race:
Yes, it is from block throttle.

> 1. Moving the .queue_lock initialization from blk_init_queue_node() into
>    blk_alloc_queue_node() such that it happens before the blkcg_init_queue()
>    call. This however will require a tree-wide change of the blk_alloc_queue()
>    and all blk_alloc_queue_node() calls in all block drivers.
> 2. Splitting the blk_throtl_init() function such that the blkcg_activate_policy()
>    call can occur after the .queue_lock pointer has been initialized, e.g. from
>    inside blk_init_allocated_queue() or from inside blk_register_queue() instead
>    of from inside blkcg_init_queue().
> 
At first glance, I was thinking of moving blkcg_init_queue to after
queue_lock is overridden, but I don't yet have a way to avoid the risk
during cleanup.
Since the initialization of the request queue is so fundamental, I'm not
sure whether the same risk exists in several other places.

Thanks,
Joseph

> I'm not sure what the best approach is. Maybe there is even a third approach
> that is better than the two approaches mentioned above.
> 
> Bart.
> 


