Re: raid5 hang on kernel v6.1 in combination with ext4lazyinit

Hi,

On 2024/05/15 19:57, Gustav Ekelund wrote:
Hi,

With raid5 syncing and ext4lazyinit running in parallel, I have a high
probability of hanging on the 6.1.55 kernel (Log from blocked tasks
below). I do not see this problem on the 5.10 kernel.

Thread [4] discusses patch [2] in connection with a regression going
from 6.7 to 6.7.1, so it is unclear to me whether this is the same issue.
Let me know if I should reply in [4] instead, if you think it is.

Cherry-picking [2] into 6.1 seems to resolve the hang, but following
your discussion in [4] you later revert this patch in [3]. I tried to
follow the thread, but I cannot figure out which patch is suggested to
be used instead of [2].

Would you advise against running with [2] on v6.1? Should it be used in
combination with [1] in that case?

No, you should try this patch:

https://lore.kernel.org/all/20240322081005.1112401-1-yukuai1@xxxxxxxxxxxxxxx/

Thanks,
Kuai


Best regards
Gustav

[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
[2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
[3] commit 3445139e3a59 ("Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""")
[4] https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@xxxxxxxx/

<6>[ 5487.973655][ T9272] sysrq: Show Blocked State
<6>[ 5487.974388][ T9272] task:md127_raid5     state:D stack:0 pid:2619  ppid:2      flags:0x00000008
<6>[ 5487.983896][ T9272] Call trace:
<6>[ 5487.987135][ T9272]  __switch_to+0xc0/0x100
<6>[ 5487.991406][ T9272]  __schedule+0x2a0/0x6b0
<6>[ 5487.995742][ T9272]  schedule+0x54/0xb4
<6>[ 5487.999658][ T9272]  raid5d+0x358/0x56c
<6>[ 5488.003576][ T9272]  md_thread+0xa8/0x15c
<6>[ 5488.007723][ T9272]  kthread+0x104/0x110
<6>[ 5488.011725][ T9272]  ret_from_fork+0x10/0x20
<6>[ 5488.016080][ T9272] task:md127_resync    state:D stack:0 pid:2620  ppid:2      flags:0x00000008
<6>[ 5488.025278][ T9272] Call trace:
<6>[ 5488.028491][ T9272]  __switch_to+0xc0/0x100
<6>[ 5488.032813][ T9272]  __schedule+0x2a0/0x6b0
<6>[ 5488.037075][ T9272]  schedule+0x54/0xb4
<6>[ 5488.041047][ T9272]  raid5_get_active_stripe+0x1f4/0x454
<6>[ 5488.046441][ T9272]  raid5_sync_request+0x350/0x390
<6>[ 5488.051401][ T9272]  md_do_sync+0x8ac/0xcc4
<6>[ 5488.055722][ T9272]  md_thread+0xa8/0x15c
<6>[ 5488.059812][ T9272]  kthread+0x104/0x110
<6>[ 5488.063814][ T9272]  ret_from_fork+0x10/0x20
<6>[ 5488.068225][ T9272] task:jbd2/md127-8    state:D stack:0 pid:2675  ppid:2      flags:0x00000008
<6>[ 5488.077425][ T9272] Call trace:
<6>[ 5488.080641][ T9272]  __switch_to+0xc0/0x100
<6>[ 5488.084906][ T9272]  __schedule+0x2a0/0x6b0
<6>[ 5488.089221][ T9272]  schedule+0x54/0xb4
<6>[ 5488.093135][ T9272]  md_write_start+0xfc/0x360
<6>[ 5488.097676][ T9272]  raid5_make_request+0x68/0x117c
<6>[ 5488.102695][ T9272]  md_handle_request+0x21c/0x354
<6>[ 5488.107565][ T9272]  md_submit_bio+0x74/0xb0
<6>[ 5488.111987][ T9272]  __submit_bio+0x100/0x27c
<6>[ 5488.116432][ T9272]  submit_bio_noacct_nocheck+0xdc/0x260
<6>[ 5488.121910][ T9272]  submit_bio_noacct+0x128/0x2e4
<6>[ 5488.126840][ T9272]  submit_bio+0x34/0xdc
<6>[ 5488.130935][ T9272]  submit_bh_wbc+0x120/0x170
<6>[ 5488.135521][ T9272]  submit_bh+0x14/0x20
<6>[ 5488.139527][ T9272]  jbd2_journal_commit_transaction+0xccc/0x1520 [jbd2]
<6>[ 5488.146400][ T9272]  kjournald2+0xb0/0x250 [jbd2]
<6>[ 5488.151194][ T9272]  kthread+0x104/0x110
<6>[ 5488.155198][ T9272]  ret_from_fork+0x10/0x20
<6>[ 5488.159608][ T9272] task:ext4lazyinit    state:D stack:0 pid:2677  ppid:2      flags:0x00000008
<6>[ 5488.168811][ T9272] Call trace:
<6>[ 5488.172026][ T9272]  __switch_to+0xc0/0x100
<6>[ 5488.176291][ T9272]  __schedule+0x2a0/0x6b0
<6>[ 5488.180618][ T9272]  schedule+0x54/0xb4
<6>[ 5488.184538][ T9272]  io_schedule+0x3c/0x60
<6>[ 5488.188714][ T9272]  bit_wait_io+0x18/0x70
<6>[ 5488.192947][ T9272]  __wait_on_bit+0x50/0x170
<6>[ 5488.197384][ T9272]  out_of_line_wait_on_bit+0x74/0x80
<6>[ 5488.202604][ T9272]  do_get_write_access+0x1e4/0x3c0 [jbd2]
<6>[ 5488.208326][ T9272]  jbd2_journal_get_write_access+0x80/0xc0 [jbd2]
<6>[ 5488.214683][ T9272]  __ext4_journal_get_write_access+0x80/0x1a4 [ext4]
<6>[ 5488.221392][ T9272]  ext4_init_inode_table+0x228/0x3d0 [ext4]
<6>[ 5488.227298][ T9272]  ext4_lazyinit_thread+0x410/0x5f4 [ext4]
<6>[ 5488.233066][ T9272]  kthread+0x104/0x110
<6>[ 5488.237069][ T9272]  ret_from_fork+0x10/0x20
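
A dump like the one above can be condensed to one line per blocked task,
which makes it easier to compare hangs across kernels. The following is a
minimal Python sketch and not part of the original report; the function
name summarize_blocked and the set of skipped scheduler/wait frames are
assumptions chosen for illustration, and the set is not exhaustive.

```python
import re

# Frames that only show the task going to sleep or waiting on a bit.
# Skipping them surfaces the first caller that explains the wait.
# This set is an illustrative assumption, not a complete list.
SLEEP_FRAMES = {
    "__switch_to", "__schedule", "schedule", "io_schedule",
    "bit_wait_io", "__wait_on_bit", "out_of_line_wait_on_bit",
}

# "task:<name> state:<char>" header printed for each blocked task.
TASK_RE = re.compile(r"task:(\S+)\s+state:(\S)")
# A stack frame such as "]  raid5d+0x358/0x56c" after the log prefix.
FRAME_RE = re.compile(r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_blocked(log_text):
    """Map each task name to its first frame outside SLEEP_FRAMES."""
    summary = {}
    current = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = task.group(1)
            summary[current] = None
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and summary[current] is None:
            if frame.group(1) not in SLEEP_FRAMES:
                summary[current] = frame.group(1)
    return summary


if __name__ == "__main__":
    import sys
    # Usage: python summarize.py < dmesg.txt
    for name, frame in summarize_blocked(sys.stdin.read()).items():
        print(f"{name}: {frame}")
```

Applied to the log in this message, the sketch maps md127_raid5 to raid5d,
md127_resync to raid5_get_active_stripe, jbd2/md127-8 to md_write_start,
and ext4lazyinit to do_get_write_access.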






