On 1/31/18 12:13 PM, Bart Van Assche wrote:
> Initialize the request queue lock earlier such that the following
> race can no longer occur:
>
> blk_init_queue_node             blkcg_print_blkgs
>   blk_alloc_queue_node (1)
>     q->queue_lock = &q->__queue_lock (2)
>     blkcg_init_queue(q) (3)
>                                 spin_lock_irq(blkg->q->queue_lock) (4)
>   q->queue_lock = lock (5)
>                                 spin_unlock_irq(blkg->q->queue_lock) (6)
>
> (1) allocate an uninitialized queue;
> (2) initialize queue_lock to its default internal lock;
> (3) initialize blkcg part of request queue, which will create blkg and
>     then insert it to blkg_list;
> (4) traverse blkg_list and find the created blkg, and then take its
>     queue lock, here it is the default *internal lock*;
> (5) *race window*, now queue_lock is overridden with *driver specified
>     lock*;
> (6) now unlock *driver specified lock*, not the locked *internal lock*,
>     unlock balance breaks.
>
> The changes in this patch are as follows:
> - Rename blk_alloc_queue_node() into blk_alloc_queue_node2() and add
>   a new queue lock argument.
> - Introduce a wrapper function with the same name and behavior as the
>   old blk_alloc_queue_node() function.

Let's please not do any of that, that's a horrible name to export. It's
not like we have hundreds of callers of blk_alloc_queue_node(), just
change them to pass in a NULL if they use the queue embedded lock.

-- 
Jens Axboe
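
[Editor's note: for concreteness, below is a minimal sketch of the interface
Jens suggests, loosely modeled on block/blk-core.c. The three-argument
signature, the NULL-means-embedded-lock convention, and the heavily trimmed
body are assumptions for illustration, not the patch that was actually
merged.]

```c
/*
 * Simplified sketch (assumption, not the committed patch):
 * blk_alloc_queue_node() grows a 'lock' argument, and callers that
 * want the queue-embedded lock pass NULL. Most of the real
 * initialization and error handling is omitted.
 */
struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id,
					   spinlock_t *lock)
{
	struct request_queue *q;

	q = kmem_cache_alloc_node(blk_requestq_cachep,
				  gfp_mask | __GFP_ZERO, node_id);
	if (!q)
		return NULL;

	/*
	 * Pick the final queue lock before blkcg_init_queue() runs, so
	 * that any blkg created there already uses the lock the queue
	 * will keep; this closes the race window at step (5) above.
	 */
	spin_lock_init(&q->__queue_lock);
	q->queue_lock = lock ? lock : &q->__queue_lock;

	if (blkcg_init_queue(q))
		goto fail;

	return q;

fail:
	kmem_cache_free(blk_requestq_cachep, q);
	return NULL;
}
```

[With this shape, a driver that owns its own spinlock would call
blk_alloc_queue_node(GFP_KERNEL, node, &driver_lock), while the existing
callers that rely on the embedded lock would simply pass NULL, with no
blk_alloc_queue_node2() or wrapper needed.]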