Hi,

On 2022-08-29 07:15, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Currently, wait_barrier() will hold 'resync_lock' to read 'conf->barrier',
> and io can't be dispatched until 'barrier' is dropped.
>
> Since holding the 'barrier' is not common, convert 'resync_lock' to use
> seqlock so that holding lock can be avoided in fast path.
>
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>

I've found some lockdep issues starting with this patch in md-next while
running the mdadm tests (specifically 00raid10, when run about 10 times
in a row).

I've seen a couple of different lockdep errors. The first seems to be
reproducible on this patch; it then possibly changes to the second on
subsequent patches, though I'm not sure exactly. I haven't dug into it
too deeply, but hopefully it can be fixed easily.

Logan

--

================================
WARNING: inconsistent lock state
6.0.0-rc2-eid-vmlocalyes-dbg-00023-gfd68041d2fd2 #2604 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
fsck.ext3/1695 [HC0[0]:SC0[0]:HE0:SE1] takes:
ffff8881049b0120 (&____s->seqcount#10){+.?.}-{0:0}, at: raid10_read_request+0x21f/0x760 (raid10.c:1134)
{IN-SOFTIRQ-W} state was registered at:
  lock_acquire+0x183/0x440
  lower_barrier+0x5e/0xd0
  end_sync_request+0x178/0x180
  end_sync_write+0x193/0x380
  bio_endio+0x346/0x3a0
  blk_update_request+0x1eb/0x7c0
  blk_mq_end_request+0x30/0x50
  lo_complete_rq+0xb7/0x100
  blk_complete_reqs+0x77/0x90
  blk_done_softirq+0x38/0x40
  __do_softirq+0x10c/0x650
  run_ksoftirqd+0x48/0x80
  smpboot_thread_fn+0x302/0x400
  kthread+0x18c/0x1c0
  ret_from_fork+0x1f/0x30
irq event stamp: 8930
hardirqs last enabled at (8929): [<ffffffff96df8351>] _raw_spin_unlock_irqrestore+0x31/0x60
hardirqs last disabled at (8930): [<ffffffff96df7fc5>] _raw_spin_lock_irq+0x75/0x90
softirqs last enabled at (6768): [<ffffffff9554970e>] __irq_exit_rcu+0xfe/0x150
softirqs last disabled at (6757): [<ffffffff9554970e>] __irq_exit_rcu+0xfe/0x150

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&____s->seqcount#10);
  <Interrupt>
    lock(&____s->seqcount#10);

 *** DEADLOCK ***

2 locks held by fsck.ext3/1695:
 #0: ffff8881007d0930 (mapping.invalidate_lock#2){++++}-{3:3}, at: page_cache_ra_unbounded+0xaf/0x250
 #1: ffff8881049b0120 (&____s->seqcount#10){+.?.}-{0:0}, at: raid10_read_request+0x21f/0x760

stack backtrace:
CPU: 0 PID: 1695 Comm: fsck.ext3 Not tainted 6.0.0-rc2-eid-vmlocalyes-dbg-00023-gfd68041d2fd2 #2604
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5a/0x74
 dump_stack+0x10/0x12
 print_usage_bug.part.0+0x233/0x246
 mark_lock.part.0.cold+0x73/0x14f
 mark_held_locks+0x71/0xa0
 lockdep_hardirqs_on_prepare+0x158/0x230
 trace_hardirqs_on+0x34/0x100
 _raw_spin_unlock_irq+0x28/0x60
 wait_barrier+0x4a6/0x720 (raid10.c:1004)
 raid10_read_request+0x21f/0x760
 raid10_make_request+0x2d6/0x2160
 md_handle_request+0x3f3/0x5b0
 md_submit_bio+0xd9/0x120
 __submit_bio+0x9d/0x100
 submit_bio_noacct_nocheck+0x1fd/0x470
 submit_bio_noacct+0x4c2/0xbb0
 submit_bio+0x3f/0xf0
 mpage_readahead+0x323/0x3b0
 blkdev_readahead+0x15/0x20
 read_pages+0x136/0x7a0
 page_cache_ra_unbounded+0x18d/0x250
 page_cache_ra_order+0x2c9/0x400
 ondemand_readahead+0x320/0x730
 page_cache_sync_ra+0xa6/0xb0
 filemap_get_pages+0x1eb/0xc00
 filemap_read+0x1f1/0x770
 blkdev_read_iter+0x164/0x310
 vfs_read+0x467/0x5a0
 __x64_sys_pread64+0x122/0x160
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
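If I'm reading the first report right, the seqcount write section is
entered from softirq context by lower_barrier() (via the sync-write
completion path above), while wait_barrier() can sit inside the same
write section in process context with softirqs enabled. A minimal sketch
of that pattern, with made-up names for illustration (this is not the
actual raid10 code):

#include <linux/seqlock.h>

struct barrier_sketch {
	seqlock_t lock;		/* stands in for conf->resync_lock */
	int barrier;
};

/* Writer reached from softirq context, like lower_barrier() via
 * end_sync_write() -> bio_endio() in the report above. */
static void sketch_lower_barrier(struct barrier_sketch *bs)
{
	unsigned long flags;

	write_seqlock_irqsave(&bs->lock, flags);
	bs->barrier--;
	write_sequnlock_irqrestore(&bs->lock, flags);
}

/* Writer in process context: if this section can run with softirqs
 * enabled, a softirq taking the same seqlock on this CPU spins on a
 * lock its own interrupted task already holds -- the self-deadlock
 * lockdep shows as {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W}. */
static void sketch_wait_barrier(struct barrier_sketch *bs)
{
	write_seqlock(&bs->lock);
	/* ... wait for bs->barrier to drop ... */
	write_sequnlock(&bs->lock);
}

In the actual backtrace the section is opened with interrupts off and
they get re-enabled by the _raw_spin_unlock_irq() inside wait_barrier(),
which, as far as I can tell, amounts to the same thing.
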
--

======================================================
WARNING: possible circular locking dependency detected
6.0.0-rc2-eid-vmlocalyes-dbg-00027-gcd6aa5181bbb #2600 Not tainted
------------------------------------------------------
systemd-udevd/292 is trying to acquire lock:
ffff88817b644170 (&(&conf->resync_lock)->lock){....}-{2:2}, at: wait_barrier+0x4fe/0x770

but task is already holding lock:
ffff88817b644120 (&____s->seqcount#11){+.+.}-{0:0}, at: raid10_read_request+0x21f/0x760
     (raid10.c:1140 wait_barrier(), raid10.c:1204 regular_request_wait())

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&____s->seqcount#11){+.+.}-{0:0}:
       raise_barrier+0xe0/0x300 (raid10.c:940 write_seqlock_irq())
       raid10_sync_request+0x629/0x4750 (raid10.c:3689 raise_barrier())
       md_do_sync.cold+0x8ec/0x1491
       md_thread+0x19d/0x2d0
       kthread+0x18c/0x1c0
       ret_from_fork+0x1f/0x30

-> #0 (&(&conf->resync_lock)->lock){....}-{2:2}:
       __lock_acquire+0x1cb4/0x3170
       lock_acquire+0x183/0x440
       _raw_spin_lock_irq+0x4d/0x90
       wait_barrier+0x4fe/0x770
       raid10_read_request+0x21f/0x760
           (raid10.c:1140 wait_barrier(), raid10.c:1204 regular_request_wait())
       raid10_make_request+0x2d6/0x2190
       md_handle_request+0x3f3/0x5b0
       md_submit_bio+0xd9/0x120
       __submit_bio+0x9d/0x100
       submit_bio_noacct_nocheck+0x1fd/0x470
       submit_bio_noacct+0x4c2/0xbb0
       submit_bio+0x3f/0xf0
       submit_bh_wbc+0x270/0x2a0
       block_read_full_folio+0x37c/0x580
       blkdev_read_folio+0x18/0x20
       filemap_read_folio+0x3f/0x110
       do_read_cache_folio+0x13b/0x2c0
       read_cache_folio+0x42/0x50
       read_part_sector+0x74/0x1c0
       read_lba+0x176/0x2a0
       efi_partition+0x1ce/0xdd0
       bdev_disk_changed+0x2e7/0x6a0
       blkdev_get_whole+0xd2/0x140
       blkdev_get_by_dev.part.0+0x37f/0x570
       blkdev_get_by_dev+0x51/0x60
       disk_scan_partitions+0xad/0xf0
       blkdev_common_ioctl+0x3f3/0xdf0
       blkdev_ioctl+0x1e1/0x450
       __x64_sys_ioctl+0xc0/0x100
       do_syscall_64+0x35/0x80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&____s->seqcount#11);
                               lock(&(&conf->resync_lock)->lock);
                               lock(&____s->seqcount#11);
  lock(&(&conf->resync_lock)->lock);

 *** DEADLOCK ***

2 locks held by systemd-udevd/292:
 #0: ffff88817a532528 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev.part.0+0x180/0x570
 #1: ffff88817b644120 (&____s->seqcount#11){+.+.}-{0:0}, at: raid10_read_request+0x21f/0x760

stack backtrace:
CPU: 3 PID: 292 Comm: systemd-udevd Not tainted 6.0.0-rc2-eid-vmlocalyes-dbg-00027-gcd6aa5181bbb #2600
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x5a/0x74
 dump_stack+0x10/0x12
 print_circular_bug.cold+0x146/0x14b
 check_noncircular+0x1ff/0x250
 __lock_acquire+0x1cb4/0x3170
 lock_acquire+0x183/0x440
 _raw_spin_lock_irq+0x4d/0x90
 wait_barrier+0x4fe/0x770
 raid10_read_request+0x21f/0x760
 raid10_make_request+0x2d6/0x2190
 md_handle_request+0x3f3/0x5b0
 md_submit_bio+0xd9/0x120
 __submit_bio+0x9d/0x100
 submit_bio_noacct_nocheck+0x1fd/0x470
 submit_bio_noacct+0x4c2/0xbb0
 submit_bio+0x3f/0xf0
 submit_bh_wbc+0x270/0x2a0
 block_read_full_folio+0x37c/0x580
 blkdev_read_folio+0x18/0x20
 filemap_read_folio+0x3f/0x110
 do_read_cache_folio+0x13b/0x2c0
 read_cache_folio+0x42/0x50
 read_part_sector+0x74/0x1c0
 read_lba+0x176/0x2a0
 efi_partition+0x1ce/0xdd0
 bdev_disk_changed+0x2e7/0x6a0
 blkdev_get_whole+0xd2/0x140
 blkdev_get_by_dev.part.0+0x37f/0x570
 blkdev_get_by_dev+0x51/0x60
 disk_scan_partitions+0xad/0xf0
 blkdev_common_ioctl+0x3f3/0xdf0
 blkdev_ioctl+0x1e1/0x450
 __x64_sys_ioctl+0xc0/0x100
 do_syscall_64+0x35/0x80
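The second report looks like a straight ordering inversion between the
seqlock's internal spinlock and its seqcount: raise_barrier() takes the
spinlock and then enters the seqcount write section (which is simply
what write_seqlock_irq() does), while the wait_barrier() path ends up
holding the seqcount write section when it takes the spinlock. A
deliberately-broken, illustrative-only sketch of the two orderings,
again with made-up names rather than the raid10 code:

#include <linux/seqlock.h>

static DEFINE_SEQLOCK(sketch_lock);

/* raise_barrier()-like ordering: internal spinlock first, then the
 * seqcount write section -- this is what write_seqlock_irq() does. */
static void sketch_order_a(void)
{
	write_seqlock_irq(&sketch_lock);
	/* ... raise the barrier ... */
	write_sequnlock_irq(&sketch_lock);
}

/* wait_barrier()-like ordering: the seqcount write section is held
 * first, and the same seqlock's internal spinlock is taken inside
 * it -- the reverse order, closing the AB-BA cycle in the report.
 * This is the broken pattern, not something to copy. */
static void sketch_order_b(void)
{
	write_seqcount_begin(&sketch_lock.seqcount);
	spin_lock_irq(&sketch_lock.lock);
	/* ... sleep/retry logic that wants the spinlock ... */
	spin_unlock_irq(&sketch_lock.lock);
	write_seqcount_end(&sketch_lock.seqcount);
}

If that reading is right, taking both in a single order everywhere
(i.e. only entering the write section through the write_seqlock*()
helpers) ought to break the cycle, but I haven't tried it.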