On Wed, Dec 18, 2024 at 06:51:31AM +0500, Mikhail Gavrilov wrote:
> Hi,
> After commit f1be1788a32e I see in the kernel log "possible circular
> locking dependency detected" with follow stack trace:
> [ 740.877178] ======================================================
> [ 740.877180] WARNING: possible circular locking dependency detected
> [ 740.877182] 6.13.0-rc3-f44d154d6e3d+ #392 Tainted: G W L
> [ 740.877184] ------------------------------------------------------
> [ 740.877186] btrfs-transacti/839 is trying to acquire lock:
> [ 740.877188] ffff888182336a50
> (&q->q_usage_counter(io)#2){++++}-{0:0}, at: __submit_bio+0x335/0x520
> [ 740.877197]
> but task is already holding lock:
> [ 740.877198] ffff8881826f7048 (btrfs-tree-00){++++}-{4:4}, at:
> btrfs_tree_read_lock_nested+0x27/0x170
> [ 740.877205]
> which lock already depends on the new lock.
>
> [ 740.877206]
> the existing dependency chain (in reverse order) is:
> [ 740.877207]
> -> #4 (btrfs-tree-00){++++}-{4:4}:
> [ 740.877211] lock_release+0x397/0xd90
> [ 740.877215] up_read+0x1b/0x30
> [ 740.877217] btrfs_search_slot+0x16c9/0x31f0
> [ 740.877220] btrfs_lookup_inode+0xa9/0x360
> [ 740.877222] __btrfs_update_delayed_inode+0x131/0x760
> [ 740.877225] btrfs_async_run_delayed_root+0x4bc/0x630
> [ 740.877226] btrfs_work_helper+0x1b5/0xa50
> [ 740.877228] process_one_work+0x899/0x14b0
> [ 740.877231] worker_thread+0x5e6/0xfc0
> [ 740.877233] kthread+0x2d2/0x3a0
> [ 740.877235] ret_from_fork+0x31/0x70
> [ 740.877238] ret_from_fork_asm+0x1a/0x30
> [ 740.877240]
> -> #3 (&delayed_node->mutex){+.+.}-{4:4}:
> [ 740.877244] __mutex_lock+0x1ab/0x12c0
> [ 740.877247] __btrfs_release_delayed_node.part.0+0xa0/0xd40
> [ 740.877249] btrfs_evict_inode+0x44d/0xc20
> [ 740.877252] evict+0x3a4/0x840
> [ 740.877255] dispose_list+0xf0/0x1c0
> [ 740.877257] prune_icache_sb+0xe3/0x160
> [ 740.877259] super_cache_scan+0x30d/0x4f0
> [ 740.877261] do_shrink_slab+0x349/0xd60
> [ 740.877264] shrink_slab+0x7a4/0xd20
> [ 740.877266] shrink_one+0x403/0x830
> [ 740.877268] shrink_node+0x2337/0x3a60
> [ 740.877270] balance_pgdat+0xa4f/0x1500
> [ 740.877272] kswapd+0x4f3/0x940
> [ 740.877274] kthread+0x2d2/0x3a0
> [ 740.877276] ret_from_fork+0x31/0x70
> [ 740.877278] ret_from_fork_asm+0x1a/0x30
> [ 740.877280]
> -> #2 (fs_reclaim){+.+.}-{0:0}:
> [ 740.877283] fs_reclaim_acquire+0xc9/0x110
> [ 740.877286] __kmalloc_noprof+0xeb/0x690
> [ 740.877288] sd_revalidate_disk.isra.0+0x4356/0x8e00
> [ 740.877291] sd_probe+0x869/0xfa0
> [ 740.877293] really_probe+0x1e0/0x8a0
> [ 740.877295] __driver_probe_device+0x18c/0x370
> [ 740.877297] driver_probe_device+0x4a/0x120
> [ 740.877299] __device_attach_driver+0x162/0x270
> [ 740.877300] bus_for_each_drv+0x115/0x1a0
> [ 740.877303] __device_attach_async_helper+0x1a0/0x240
> [ 740.877305] async_run_entry_fn+0x97/0x4f0
> [ 740.877307] process_one_work+0x899/0x14b0
> [ 740.877309] worker_thread+0x5e6/0xfc0
> [ 740.877310] kthread+0x2d2/0x3a0
> [ 740.877312] ret_from_fork+0x31/0x70
> [ 740.877314] ret_from_fork_asm+0x1a/0x30
> [ 740.877316]
> -> #1 (&q->limits_lock){+.+.}-{4:4}:
> [ 740.877320] __mutex_lock+0x1ab/0x12c0
> [ 740.877321] nvme_update_ns_info_block+0x476/0x2630 [nvme_core]
> [ 740.877332] nvme_update_ns_info+0xbe/0xa60 [nvme_core]
> [ 740.877339] nvme_alloc_ns+0x1589/0x2c40 [nvme_core]
> [ 740.877346] nvme_scan_ns+0x579/0x660 [nvme_core]
> [ 740.877353] async_run_entry_fn+0x97/0x4f0
> [ 740.877355] process_one_work+0x899/0x14b0
> [ 740.877357] worker_thread+0x5e6/0xfc0
> [ 740.877358] kthread+0x2d2/0x3a0
> [ 740.877360] ret_from_fork+0x31/0x70
> [ 740.877362] ret_from_fork_asm+0x1a/0x30
> [ 740.877364]
> -> #0 (&q->q_usage_counter(io)#2){++++}-{0:0}:

This is another deadlock caused by a dependency between q->limits_lock and
q->q_usage_counter, the same as the one under discussion here:

https://lore.kernel.org/linux-block/20241216080206.2850773-2-ming.lei@xxxxxxxxxx/

The dependency from queue_limits_start_update() to blk_mq_freeze_queue()
should be cut.

Thanks,
Ming
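
[Editor's note: for context on the shape of the problem, a minimal sketch of
the ordering that records the limits_lock -> q_usage_counter(io) edge closing
the cycle above. This is illustrative only, not the nvme code and not the
proposed fix; the function name and the logical_block_size tweak are made up,
only the block-layer helpers are real.]

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

static int example_update_limits(struct request_queue *q, unsigned int lbs)
{
	struct queue_limits lim;
	int ret;

	/* queue_limits_start_update() takes q->limits_lock */
	lim = queue_limits_start_update(q);
	lim.logical_block_size = lbs;

	/*
	 * Freezing here waits for q->q_usage_counter(io) while
	 * q->limits_lock is still held: this is the dependency
	 * that needs to be cut.
	 */
	blk_mq_freeze_queue(q);

	/* applies @lim and drops q->limits_lock */
	ret = queue_limits_commit_update(q, &lim);

	blk_mq_unfreeze_queue(q);
	return ret;
}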