Hi,

On 2024/05/15 19:57, Gustav Ekelund wrote:
Hi,

With raid5 syncing and ext4lazyinit running in parallel, I have a high probability of hanging on the 6.1.55 kernel (log from blocked tasks below). I do not see this problem on the 5.10 kernel.

In thread [4], patch [2] is described as a regression going from 6.7 to 6.7.1, so it is unclear to me if this is the same issue. Let me know if I should reply on [4] if you think this could be the same issue.

Cherry-picking [2] into 6.1 seems to resolve the hang, but following your discussion in [4], you later revert this patch in [3]. I tried to follow the thread, but I cannot figure out which patch is suggested to be used instead of [2].

Would you advise against running with [2] on v6.1? Should it be used in combination with [1] in that case?
No, you should try this patch:

https://lore.kernel.org/all/20240322081005.1112401-1-yukuai1@xxxxxxxxxxxxxxx/

Thanks,
Kuai
Best regards
Gustav

[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
[2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
[3] commit 3445139e3a59 ("Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""")
[4] https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@xxxxxxxx/

<6>[ 5487.973655][ T9272] sysrq: Show Blocked State
<6>[ 5487.974388][ T9272] task:md127_raid5 state:D stack:0 pid:2619 ppid:2 flags:0x00000008
<6>[ 5487.983896][ T9272] Call trace:
<6>[ 5487.987135][ T9272] __switch_to+0xc0/0x100
<6>[ 5487.991406][ T9272] __schedule+0x2a0/0x6b0
<6>[ 5487.995742][ T9272] schedule+0x54/0xb4
<6>[ 5487.999658][ T9272] raid5d+0x358/0x56c
<6>[ 5488.003576][ T9272] md_thread+0xa8/0x15c
<6>[ 5488.007723][ T9272] kthread+0x104/0x110
<6>[ 5488.011725][ T9272] ret_from_fork+0x10/0x20
<6>[ 5488.016080][ T9272] task:md127_resync state:D stack:0 pid:2620 ppid:2 flags:0x00000008
<6>[ 5488.025278][ T9272] Call trace:
<6>[ 5488.028491][ T9272] __switch_to+0xc0/0x100
<6>[ 5488.032813][ T9272] __schedule+0x2a0/0x6b0
<6>[ 5488.037075][ T9272] schedule+0x54/0xb4
<6>[ 5488.041047][ T9272] raid5_get_active_stripe+0x1f4/0x454
<6>[ 5488.046441][ T9272] raid5_sync_request+0x350/0x390
<6>[ 5488.051401][ T9272] md_do_sync+0x8ac/0xcc4
<6>[ 5488.055722][ T9272] md_thread+0xa8/0x15c
<6>[ 5488.059812][ T9272] kthread+0x104/0x110
<6>[ 5488.063814][ T9272] ret_from_fork+0x10/0x20
<6>[ 5488.068225][ T9272] task:jbd2/md127-8 state:D stack:0 pid:2675 ppid:2 flags:0x00000008
<6>[ 5488.077425][ T9272] Call trace:
<6>[ 5488.080641][ T9272] __switch_to+0xc0/0x100
<6>[ 5488.084906][ T9272] __schedule+0x2a0/0x6b0
<6>[ 5488.089221][ T9272] schedule+0x54/0xb4
<6>[ 5488.093135][ T9272] md_write_start+0xfc/0x360
<6>[ 5488.097676][ T9272] raid5_make_request+0x68/0x117c
<6>[ 5488.102695][ T9272] md_handle_request+0x21c/0x354
<6>[ 5488.107565][ T9272] md_submit_bio+0x74/0xb0
<6>[ 5488.111987][ T9272] __submit_bio+0x100/0x27c
<6>[ 5488.116432][ T9272] submit_bio_noacct_nocheck+0xdc/0x260
<6>[ 5488.121910][ T9272] submit_bio_noacct+0x128/0x2e4
<6>[ 5488.126840][ T9272] submit_bio+0x34/0xdc
<6>[ 5488.130935][ T9272] submit_bh_wbc+0x120/0x170
<6>[ 5488.135521][ T9272] submit_bh+0x14/0x20
<6>[ 5488.139527][ T9272] jbd2_journal_commit_transaction+0xccc/0x1520 [jbd2]
<6>[ 5488.146400][ T9272] kjournald2+0xb0/0x250 [jbd2]
<6>[ 5488.151194][ T9272] kthread+0x104/0x110
<6>[ 5488.155198][ T9272] ret_from_fork+0x10/0x20
<6>[ 5488.159608][ T9272] task:ext4lazyinit state:D stack:0 pid:2677 ppid:2 flags:0x00000008
<6>[ 5488.168811][ T9272] Call trace:
<6>[ 5488.172026][ T9272] __switch_to+0xc0/0x100
<6>[ 5488.176291][ T9272] __schedule+0x2a0/0x6b0
<6>[ 5488.180618][ T9272] schedule+0x54/0xb4
<6>[ 5488.184538][ T9272] io_schedule+0x3c/0x60
<6>[ 5488.188714][ T9272] bit_wait_io+0x18/0x70
<6>[ 5488.192947][ T9272] __wait_on_bit+0x50/0x170
<6>[ 5488.197384][ T9272] out_of_line_wait_on_bit+0x74/0x80
<6>[ 5488.202604][ T9272] do_get_write_access+0x1e4/0x3c0 [jbd2]
<6>[ 5488.208326][ T9272] jbd2_journal_get_write_access+0x80/0xc0 [jbd2]
<6>[ 5488.214683][ T9272] __ext4_journal_get_write_access+0x80/0x1a4 [ext4]
<6>[ 5488.221392][ T9272] ext4_init_inode_table+0x228/0x3d0 [ext4]
<6>[ 5488.227298][ T9272] ext4_lazyinit_thread+0x410/0x5f4 [ext4]
<6>[ 5488.233066][ T9272] kthread+0x104/0x110
<6>[ 5488.237069][ T9272] ret_from_fork+0x10/0x20