On 2/10/25 7:12 AM, Ming Lei wrote: > On Mon, Feb 10, 2025 at 8:52 AM Shinichiro Kawasaki > <shinichiro.kawasaki@xxxxxxx> wrote: >> >> On Feb 09, 2025 / 20:20, Ming Lei wrote: >>> All block internal state for dealing adding/removing debugfs entries >>> have been removed, and debugfs can sync everything for us in fs level, >>> so don't grab q->debugfs_mutex for adding/removing block internal debugfs >>> entries. >>> >>> Now q->debugfs_mutex is only used for blktrace, meantime move creating >>> queue debugfs dir code out of q->sysfs_lock. Both the two locks are >>> connected with queue freeze IO lock. Then queue freeze IO lock chain >>> with debugfs lock is cut. >>> >>> The following lockdep report can be fixed: >>> >>> https://lore.kernel.org/linux-block/ougniadskhks7uyxguxihgeuh2pv4yaqv4q3emo4gwuolgzdt6@brotly74p6bs/ >>> >>> Follows contexts which adds/removes debugfs entries: >>> >>> - update nr_hw_queues >>> >>> - add/remove disks >>> >>> - elevator switch >>> >>> - blktrace >>> >>> blktrace only adds entries under disk top directory, so we can ignore it, >>> because it can only work iff disk is added. Also nothing overlapped with >>> the other two contex, blktrace context is fine. >>> >>> Elevator switch is only allowed after disk is added, so there isn't race >>> with add/remove disk. blk_mq_update_nr_hw_queues() always restores to >>> previous elevator, so no race between these two. Elevator switch context >>> is fine. >>> >>> So far blk_mq_update_nr_hw_queues() doesn't hold debugfs lock for >>> adding/removing hctx entries, there might be race with add/remove disk, >>> which is just fine in reality: >>> >>> - blk_mq_update_nr_hw_queues() is usually for error recovery, and disk >>> won't be added/removed at the same time >>> >>> - even though there is race between the two contexts, it is just fine, >>> since hctx won't be freed until queue is dead >>> >>> - we never see reports in this area without holding debugfs in >>> blk_mq_update_nr_hw_queues() >>> >>> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> >>> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> >> >> Ming, thank you for this quick action. I applied this series on top of >> v6.14-rc1 kernel and ran the block/002 test case. Unfortunately, still if fails >> occasionally with the lockdep "WARNING: possible circular locking dependency >> detected" below. Now debugfs_mutex is not reported as one of the dependent >> locks, then I think this fix is working as expected. Instead, eq->sysfs_lock >> creates similar dependency. My mere guess is that this patch avoids one >> dependency, but still another dependency is left. > > Indeed, this patch cuts dependency on both q->sysfs_lock and q->debugfs_lock, Glad to see that with this patch we're able to cut the dependency between q->sysfs_lock and q->debugfs_lock. > but elevator ->sysfs_lock isn't covered, :-( > I believe that shall be fixed with the current effort undergoing here: https://lore.kernel.org/all/20250205144506.663819-1-nilay@xxxxxxxxxxxxx/ Thanks, --Nilay