Re: [PATCH v2 2/2] block: Fix a race between the throttling code and request queue initialization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 2018-02-03 at 10:51 +0800, Joseph Qi wrote:
> Hi Bart,
> 
> On 18/2/3 00:21, Bart Van Assche wrote:
> > On Fri, 2018-02-02 at 09:02 +0800, Joseph Qi wrote:
> > > We triggered this race when using single queue. I'm not sure if it
> > > exists in multi-queue.
> > 
> > Regarding the races between modifying the queue_lock pointer and the code that
> > uses that pointer, I think the following construct in blk_cleanup_queue() is
> > sufficient to avoid races between the queue_lock pointer assignment and the code
> > that executes concurrently with blk_cleanup_queue():
> > 
> > 	spin_lock_irq(lock);
> > 	if (q->queue_lock != &q->__queue_lock)
> > 		q->queue_lock = &q->__queue_lock;
> > 	spin_unlock_irq(lock);
> > 
> 
> IMO, the race also exists.
> 
> blk_cleanup_queue                   blkcg_print_blkgs
>   spin_lock_irq(lock) (1)           spin_lock_irq(blkg->q->queue_lock) (2,5)
>     q->queue_lock = &q->__queue_lock (3)
>   spin_unlock_irq(lock) (4)
>                                     spin_unlock_irq(blkg->q->queue_lock) (6)
> 
> (1) take driver lock;
> (2) busy loop for driver lock;
> (3) override driver lock with internal lock;
> (4) unlock driver lock; 
> (5) can take driver lock now;
> (6) but unlock internal lock.
> 
> If we get blkg->q->queue_lock to local first like blk_cleanup_queue,
> it indeed can fix the different lock use in lock/unlock. But since
> blk_cleanup_queue has overridden queue lock to internal lock now, I'm
> afraid we couldn't still use driver lock in blkcg_print_blkgs.

(+ Jan Kara)

Hello Joseph,

That's a good catch. Since modifying all code that accesses the queue_lock
pointer and that can race with blk_cleanup_queue() would be too cumbersome I
see only one solution, namely making the request queue cgroup and sysfs
attributes invisible before the queue_lock pointer is restored. Leaving the
debugfs attributes visible while blk_cleanup_queue() is in progress should
be fine if the request queue initialization code is modified such that it
only modifies the queue_lock pointer for legacy queues. Jan, I think some
time ago you objected when I proposed to move code from __blk_release_queue()
into blk_cleanup_queue(). Would you be fine with a slightly different
approach, namely making block cgroup and sysfs attributes invisible earlier,
namely from inside blk_cleanup_queue() instead of from inside
__blk_release_queue()?

Thanks,

Bart.






[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux