Re: [PATCH 1/2] block: avoid to hold q->limits_lock across APIs for atomic update queue limits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 17, 2024 at 08:07:06AM -0800, Damien Le Moal wrote:
> On 2024/12/16 23:30, Ming Lei wrote:
> > On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
> >> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
> >>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> >>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> >>>>> The local copy can be updated in any way with any data, so does another
> >>>>> concurrent update on q->limits really matter?
> >>>>
> >>>> Yes, because that means one of the updates get lost even if it is
> >>>> for entirely separate fields.
> >>>
> >>> Right, but the limits are still valid anytime.
> >>>
> >>> Any suggestion for fixing this deadlock?
> >>
> >> What is "this deadlock"?
> > 
> > The commit log provides two reports:
> > 
> > - lockdep warning
> > 
> > https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@xxxxxxxxxxxxxxxxxxxx/
> > 
> > - real deadlock report
> > 
> > https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/
> > 
> > It is actually one simple ABBA lock:
> > 
> > 1) queue_attr_store()
> > 
> >       blk_mq_freeze_queue(q);					//queue freeze lock
> >       res = entry->store(disk, page, length);
> > 	  			queue_limits_start_update		//->limits_lock
> > 				...
> > 				queue_limits_commit_update
> >       blk_mq_unfreeze_queue(q);
> 
> The locking + freeze pattern should be:
> 
> 	lim = queue_limits_start_update(q);
> 	...
> 	blk_mq_freeze_queue(q);
> 	ret = queue_limits_commit_update(q, &lim);
> 	blk_mq_unfreeze_queue(q);
> 
> This pattern is used in most places and anything that does not use it is likely
> susceptible to a similar ABBA deadlock. We should probably look into trying to
> integrate the freeze/unfreeze calls directly into queue_limits_commit_update().
> 
> Fixing queue_attr_store() to use this pattern seems simpler than trying to fix
> sd_revalidate_disk().

This way looks good, just commit af2814149883 ("block: freeze the queue in
queue_attr_store") needs to be reverted, and freeze/unfreeze has to be
added to each queue attribute .store() handler.



Thanks,
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux