Re: [PATCH 1/2] block: avoid to hold q->limits_lock across APIs for atomic update queue limits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 18, 2024 at 05:03:00PM +0530, Nilay Shroff wrote:
> 
> 
> On 12/18/24 07:39, Ming Lei wrote:
> > On Tue, Dec 17, 2024 at 08:07:06AM -0800, Damien Le Moal wrote:
> >> On 2024/12/16 23:30, Ming Lei wrote:
> >>> On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
> >>>> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
> >>>>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> >>>>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> >>>>>>> The local copy can be updated in any way with any data, so does another
> >>>>>>> concurrent update on q->limits really matter?
> >>>>>>
> >>>>>> Yes, because that means one of the updates get lost even if it is
> >>>>>> for entirely separate fields.
> >>>>>
> >>>>> Right, but the limits are still valid anytime.
> >>>>>
> >>>>> Any suggestion for fixing this deadlock?
> >>>>
> >>>> What is "this deadlock"?
> >>>
> >>> The commit log provides two reports:
> >>>
> >>> - lockdep warning
> >>>
> >>> https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@xxxxxxxxxxxxxxxxxxxx/
> >>>
> >>> - real deadlock report
> >>>
> >>> https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/
> >>>
> >>> It is actually one simple ABBA lock:
> >>>
> >>> 1) queue_attr_store()
> >>>
> >>>       blk_mq_freeze_queue(q);					//queue freeze lock
> >>>       res = entry->store(disk, page, length);
> >>> 	  			queue_limits_start_update		//->limits_lock
> >>> 				...
> >>> 				queue_limits_commit_update
> >>>       blk_mq_unfreeze_queue(q);
> >>
> >> The locking + freeze pattern should be:
> >>
> >> 	lim = queue_limits_start_update(q);
> >> 	...
> >> 	blk_mq_freeze_queue(q);
> >> 	ret = queue_limits_commit_update(q, &lim);
> >> 	blk_mq_unfreeze_queue(q);
> >>
> >> This pattern is used in most places and anything that does not use it is likely
> >> susceptible to a similar ABBA deadlock. We should probably look into trying to
> >> integrate the freeze/unfreeze calls directly into queue_limits_commit_update().
> >>
> >> Fixing queue_attr_store() to use this pattern seems simpler than trying to fix
> >> sd_revalidate_disk().
> > 
> > This way looks good, just commit af2814149883 ("block: freeze the queue in
> > queue_attr_store") needs to be reverted, and freeze/unfreeze has to be
> > added to each queue attribute .store() handler.
> > 
> Wouldn't it be feasible to add blk-mq freeze in queue_limits_start_update()
> and blk-mq unfreeze in queue_limits_commit_update()? If we do this then 
> the pattern would be, 
> 
> queue_limits_start_update(): limit-lock + freeze
> queue_limits_commit_update() : unfreeze + limit-unlock  
> 
> Then in queue_attr_store() we shall just remove freeze/unfreeze.
> 
> We also need to fix few call sites where we've code block,
> 
> {
>     blk_mq_freeze_queue()
>     ...
>     queue_limits_start_update()
>     ...    
>     queue_limits_commit_update()
>     ...
>     blk_mq_unfreeze_queue()
>     
> }
> 
> In the above code block, we may then replace blk_mq_freeze_queue() with
> queue_limits_commit_update() and similarly replace blk_mq_unfreeze_queue() 
> with queue_limits_commit_update().
> 
> {
>     queue_limits_start_update()
>     ...
>     ...
>     ...
>     queue_limits_commit_update()

In sd_revalidate_disk(), blk-mq request is allocated under queue_limits_start_update(),
then ABBA deadlock is triggered since blk_queue_enter() implies same lock(freeze lock)
from blk_mq_freeze_queue().


Thanks,
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux