On 12/18/24 07:39, Ming Lei wrote: > On Tue, Dec 17, 2024 at 08:07:06AM -0800, Damien Le Moal wrote: >> On 2024/12/16 23:30, Ming Lei wrote: >>> On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote: >>>> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote: >>>>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote: >>>>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote: >>>>>>> The local copy can be updated in any way with any data, so does another >>>>>>> concurrent update on q->limits really matter? >>>>>> >>>>>> Yes, because that means one of the updates get lost even if it is >>>>>> for entirely separate fields. >>>>> >>>>> Right, but the limits are still valid anytime. >>>>> >>>>> Any suggestion for fixing this deadlock? >>>> >>>> What is "this deadlock"? >>> >>> The commit log provides two reports: >>> >>> - lockdep warning >>> >>> https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@xxxxxxxxxxxxxxxxxxxx/ >>> >>> - real deadlock report >>> >>> https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/ >>> >>> It is actually one simple ABBA lock: >>> >>> 1) queue_attr_store() >>> >>> blk_mq_freeze_queue(q); //queue freeze lock >>> res = entry->store(disk, page, length); >>> queue_limits_start_update //->limits_lock >>> ... >>> queue_limits_commit_update >>> blk_mq_unfreeze_queue(q); >> >> The locking + freeze pattern should be: >> >> lim = queue_limits_start_update(q); >> ... >> blk_mq_freeze_queue(q); >> ret = queue_limits_commit_update(q, &lim); >> blk_mq_unfreeze_queue(q); >> >> This pattern is used in most places and anything that does not use it is likely >> susceptible to a similar ABBA deadlock. We should probably look into trying to >> integrate the freeze/unfreeze calls directly into queue_limits_commit_update(). >> >> Fixing queue_attr_store() to use this pattern seems simpler than trying to fix >> sd_revalidate_disk(). > > This way looks good, just commit af2814149883 ("block: freeze the queue in > queue_attr_store") needs to be reverted, and freeze/unfreeze has to be > added to each queue attribute .store() handler. > Wouldn't it be feasible to add blk-mq freeze in queue_limits_start_update() and blk-mq unfreeze in queue_limits_commit_update()? If we do this then the pattern would be, queue_limits_start_update(): limit-lock + freeze queue_limits_commit_update() : unfreeze + limit-unlock Then in queue_attr_store() we shall just remove freeze/unfreeze. We also need to fix few call sites where we've code block, { blk_mq_freeze_queue() ... queue_limits_start_update() ... queue_limits_commit_update() ... blk_mq_unfreeze_queue() } In the above code block, we may then replace blk_mq_freeze_queue() with queue_limits_commit_update() and similarly replace blk_mq_unfreeze_queue() with queue_limits_commit_update(). { queue_limits_start_update() ... ... ... queue_limits_commit_update() } Thanks, --Nilay