On Sat, 2018-02-03 at 10:51 +0800, Joseph Qi wrote: > Hi Bart, > > On 18/2/3 00:21, Bart Van Assche wrote: > > On Fri, 2018-02-02 at 09:02 +0800, Joseph Qi wrote: > > > We triggered this race when using single queue. I'm not sure if it > > > exists in multi-queue. > > > > Regarding the races between modifying the queue_lock pointer and the code that > > uses that pointer, I think the following construct in blk_cleanup_queue() is > > sufficient to avoid races between the queue_lock pointer assignment and the code > > that executes concurrently with blk_cleanup_queue(): > > > > spin_lock_irq(lock); > > if (q->queue_lock != &q->__queue_lock) > > q->queue_lock = &q->__queue_lock; > > spin_unlock_irq(lock); > > > > IMO, the race also exists. > > blk_cleanup_queue blkcg_print_blkgs > spin_lock_irq(lock) (1) spin_lock_irq(blkg->q->queue_lock) (2,5) > q->queue_lock = &q->__queue_lock (3) > spin_unlock_irq(lock) (4) > spin_unlock_irq(blkg->q->queue_lock) (6) > > (1) take driver lock; > (2) busy loop for driver lock; > (3) override driver lock with internal lock; > (4) unlock driver lock; > (5) can take driver lock now; > (6) but unlock internal lock. > > If we get blkg->q->queue_lock to local first like blk_cleanup_queue, > it indeed can fix the different lock use in lock/unlock. But since > blk_cleanup_queue has overridden queue lock to internal lock now, I'm > afraid we couldn't still use driver lock in blkcg_print_blkgs. (+ Jan Kara) Hello Joseph, That's a good catch. Since modifying all code that accesses the queue_lock pointer and that can race with blk_cleanup_queue() would be too cumbersome I see only one solution, namely making the request queue cgroup and sysfs attributes invisible before the queue_lock pointer is restored. Leaving the debugfs attributes visible while blk_cleanup_queue() is in progress should be fine if the request queue initialization code is modified such that it only modifies the queue_lock pointer for legacy queues. Jan, I think some time ago you objected when I proposed to move code from __blk_release_queue() into blk_cleanup_queue(). Would you be fine with a slightly different approach, namely making block cgroup and sysfs attributes invisible earlier, namely from inside blk_cleanup_queue() instead of from inside __blk_release_queue()? Thanks, Bart.