Re: [bug-report] Soft lockup when running Raid5 periodic check and light workload, kernel 6.8.7

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

在 2024/06/03 19:09, Darius Ski 写道:
Hi all,

we have 5 device RAID5 array of SSD disks and yesterday ( during
Debian monthly cron mdadm/checkarray check ) experienced lockup under
light writes from our FileWrServer.
Array HW was fine before, during and after, just the resync, our app
threads were locked up completely.
The system completed such check during first Sunday of April and had
no troubles reading/writing 20+ terabytes of data during system prep.

cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 nvme8n1p1[5] nvme7n1p1[3] nvme0n1p1[0] nvme4n1p1[2]
nvme2n1p1[1]
       120022794240 blocks super 1.2 level 5, 64k chunk, algorithm 2
[5/5] [UUUUU]
       bitmap: 18/224 pages [72KB], 65536KB chunk

[Sun Jun  2 16:38:36 2024] INFO: task md0_raid5:1545 blocked for more
than 120 seconds.
[Sun Jun  2 16:38:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:38:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:38:36 2024] task:md0_raid5       state:D stack:0
pid:1545  tgid:1545  ppid:2      flags:0x00004000
[Sun Jun  2 16:38:36 2024] Call Trace:
[Sun Jun  2 16:38:36 2024]  <TASK>
[Sun Jun  2 16:38:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:38:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:38:36 2024]  raid5d+0x466/0x670
[Sun Jun  2 16:38:36 2024]  ? __schedule+0x372/0x9a0
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  ? super_90_load.part.0+0x340/0x340
[Sun Jun  2 16:38:36 2024]  md_thread+0x94/0x120
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:38:36 2024]  </TASK>
[Sun Jun  2 16:38:36 2024] INFO: task jbd2/md0p1-8:1963 blocked for
more than 120 seconds.
[Sun Jun  2 16:38:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:38:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:38:36 2024] task:jbd2/md0p1-8    state:D stack:0
pid:1963  tgid:1963  ppid:2      flags:0x00004000
[Sun Jun  2 16:38:36 2024] Call Trace:
[Sun Jun  2 16:38:36 2024]  <TASK>
[Sun Jun  2 16:38:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:38:36 2024]  ? __queue_work+0x19f/0x3b0
[Sun Jun  2 16:38:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:38:36 2024]  md_write_start+0x10d/0x200
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  raid5_make_request+0x8f/0x1220
[Sun Jun  2 16:38:36 2024]  ? xas_store+0x2de/0x610
[Sun Jun  2 16:38:36 2024]  ? xas_load+0x2c/0x40
[Sun Jun  2 16:38:36 2024]  ? sched_feat_write+0x150/0x150
[Sun Jun  2 16:38:36 2024]  ? alloc_buffer_head+0x1a/0x60
[Sun Jun  2 16:38:36 2024]  ? filemap_get_entry+0x53/0x100
[Sun Jun  2 16:38:36 2024]  md_handle_request+0x132/0x210
[Sun Jun  2 16:38:36 2024]  ? __bio_split_to_limits+0x8c/0x250
[Sun Jun  2 16:38:36 2024]  ? bio_split_to_limits+0x3d/0x60
[Sun Jun  2 16:38:36 2024]  __submit_bio+0x46/0xd0
[Sun Jun  2 16:38:36 2024]  submit_bio_noacct_nocheck+0x11b/0x330
[Sun Jun  2 16:38:36 2024]  jbd2_journal_commit_transaction+0xc7a/0x1930
[Sun Jun  2 16:38:36 2024]  ? __schedule+0x372/0x9a0
[Sun Jun  2 16:38:36 2024]  kjournald2+0x95/0x220
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  ? jbd2_fc_wait_bufs+0x90/0x90
[Sun Jun  2 16:38:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:38:36 2024]  </TASK>
[Sun Jun  2 16:38:36 2024] INFO: task FileWrServerEve:2020854 blocked
for more than 120 seconds.
[Sun Jun  2 16:38:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:38:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:38:36 2024] task:FileWrServerEve state:D stack:0
pid:2020854 tgid:2020739 ppid:1      flags:0x00000002
[Sun Jun  2 16:38:36 2024] Call Trace:
[Sun Jun  2 16:38:36 2024]  <TASK>
[Sun Jun  2 16:38:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:38:36 2024]  ? ext4_getblk+0xb7/0x2c0
[Sun Jun  2 16:38:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:38:36 2024]  io_schedule+0x42/0x70
[Sun Jun  2 16:38:36 2024]  bit_wait_io+0xd/0x60
[Sun Jun  2 16:38:36 2024]  __wait_on_bit+0x48/0x140
[Sun Jun  2 16:38:36 2024]  ? bit_wait+0x60/0x60
[Sun Jun  2 16:38:36 2024]  out_of_line_wait_on_bit+0x81/0x90
[Sun Jun  2 16:38:36 2024]  ? pick_next_task_stop+0x70/0x70
[Sun Jun  2 16:38:36 2024]  do_get_write_access+0x235/0x3a0
[Sun Jun  2 16:38:36 2024]  jbd2_journal_get_write_access+0x88/0xb0
[Sun Jun  2 16:38:36 2024]  __ext4_journal_get_write_access+0x3e/0x160
[Sun Jun  2 16:38:36 2024]  __ext4_new_inode+0x5a3/0x1730
[Sun Jun  2 16:38:36 2024]  ext4_create+0xe9/0x1a0
[Sun Jun  2 16:38:36 2024]  path_openat+0xdca/0x1080
[Sun Jun  2 16:38:36 2024]  do_filp_open+0xa1/0x130
[Sun Jun  2 16:38:36 2024]  do_sys_openat2+0x74/0xa0
[Sun Jun  2 16:38:36 2024]  __x64_sys_openat+0x5c/0x70
[Sun Jun  2 16:38:36 2024]  do_syscall_64+0x3a/0xc0
[Sun Jun  2 16:38:36 2024]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[Sun Jun  2 16:38:36 2024] RIP: 0033:0x7f6b89289f80
[Sun Jun  2 16:38:36 2024] RSP: 002b:00007f6b5962d540 EFLAGS: 00000293
ORIG_RAX: 0000000000000101
[Sun Jun  2 16:38:36 2024] RAX: ffffffffffffffda RBX: 00000000000000c1
RCX: 00007f6b89289f80
[Sun Jun  2 16:38:36 2024] RDX: 00000000000000c1 RSI: 00007f6a6c5d1ee0
RDI: 00000000ffffff9c
[Sun Jun  2 16:38:36 2024] RBP: 00007f6a6c5d1ee0 R08: 0000000000000000
R09: 00000000fb82ea15
[Sun Jun  2 16:38:36 2024] R10: 00000000000081a4 R11: 0000000000000293
R12: 00000000000081a4
[Sun Jun  2 16:38:36 2024] R13: 00000000000000c1 R14: 00007f6a6c5d1ee0
R15: 00007f6b80ec9080
[Sun Jun  2 16:38:36 2024]  </TASK>
[Sun Jun  2 16:38:36 2024] INFO: task md0_resync:716263 blocked for
more than 120 seconds.
[Sun Jun  2 16:38:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:38:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:38:36 2024] task:md0_resync      state:D stack:0
pid:716263 tgid:716263 ppid:2      flags:0x00004000
[Sun Jun  2 16:38:36 2024] Call Trace:
[Sun Jun  2 16:38:36 2024]  <TASK>
[Sun Jun  2 16:38:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:38:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:38:36 2024]  raid5_get_active_stripe+0x1f9/0x4c0
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  raid5_sync_request+0x34f/0x370
[Sun Jun  2 16:38:36 2024]  ? is_mddev_idle+0xe7/0x150
[Sun Jun  2 16:38:36 2024]  md_do_sync+0x668/0xfa0
[Sun Jun  2 16:38:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:38:36 2024]  ? super_90_load.part.0+0x340/0x340
[Sun Jun  2 16:38:36 2024]  md_thread+0x94/0x120
[Sun Jun  2 16:38:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:38:36 2024]  </TASK>
[Sun Jun  2 16:38:36 2024] INFO: task kworker/u385:4:1561579 blocked
for more than 120 seconds.
[Sun Jun  2 16:38:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:38:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:38:36 2024] task:kworker/u385:4  state:D stack:0
pid:1561579 tgid:1561579 ppid:2      flags:0x00004000
[Sun Jun  2 16:38:36 2024] Workqueue: writeback wb_workfn (flush-9:0)
[Sun Jun  2 16:38:36 2024] Call Trace:
[Sun Jun  2 16:38:36 2024]  <TASK>
[Sun Jun  2 16:38:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:38:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:38:36 2024]  io_schedule+0x42/0x70
[Sun Jun  2 16:38:36 2024]  bit_wait_io+0xd/0x60
[Sun Jun  2 16:38:36 2024]  __wait_on_bit+0x48/0x140
[Sun Jun  2 16:38:36 2024]  ? bit_wait+0x60/0x60
[Sun Jun  2 16:38:36 2024]  out_of_line_wait_on_bit+0x81/0x90
[Sun Jun  2 16:38:36 2024]  ? pick_next_task_stop+0x70/0x70
[Sun Jun  2 16:38:36 2024]  do_get_write_access+0x235/0x3a0
[Sun Jun  2 16:38:36 2024]  jbd2_journal_get_write_access+0x88/0xb0
[Sun Jun  2 16:38:36 2024]  __ext4_journal_get_write_access+0x3e/0x160
[Sun Jun  2 16:38:36 2024]  ext4_mb_mark_context+0x93/0x390
[Sun Jun  2 16:38:36 2024]  ext4_mb_mark_diskspace_used+0xba/0x190
[Sun Jun  2 16:38:36 2024]  ext4_mb_new_blocks+0x13f/0xdc0
[Sun Jun  2 16:38:36 2024]  ? ext4_find_extent+0x330/0x420
[Sun Jun  2 16:38:36 2024]  ? ext4_find_extent+0x3bf/0x420
[Sun Jun  2 16:38:36 2024]  ? release_pages+0x141/0x3b0
[Sun Jun  2 16:38:36 2024]  ext4_ext_map_blocks+0x35b/0x1790
[Sun Jun  2 16:38:36 2024]  ? release_pages+0x141/0x3b0
[Sun Jun  2 16:38:36 2024]  ? filemap_get_folios_tag+0x6c/0x1b0
[Sun Jun  2 16:38:36 2024]  ? mpage_prepare_extent_to_map+0x439/0x470
[Sun Jun  2 16:38:36 2024]  ext4_map_blocks+0x164/0x5d0
[Sun Jun  2 16:38:36 2024]  ext4_do_writepages+0x694/0xba0
[Sun Jun  2 16:38:36 2024]  ? iommu_map_sg+0x113/0x1b0
[Sun Jun  2 16:38:36 2024]  ext4_writepages+0x87/0x120
[Sun Jun  2 16:38:36 2024]  do_writepages+0xac/0x160
[Sun Jun  2 16:38:36 2024]  ? wb_calc_thresh+0x41/0x50
[Sun Jun  2 16:38:36 2024]  __writeback_single_inode+0x39/0x2b0
[Sun Jun  2 16:38:36 2024]  writeback_sb_inodes+0x1ab/0x430
[Sun Jun  2 16:38:36 2024]  __writeback_inodes_wb+0x4c/0xe0
[Sun Jun  2 16:38:36 2024]  wb_writeback+0x1fd/0x250
[Sun Jun  2 16:38:36 2024]  wb_workfn+0x23d/0x400
[Sun Jun  2 16:38:36 2024]  process_one_work+0x13f/0x2c0
[Sun Jun  2 16:38:36 2024]  worker_thread+0x26d/0x390
[Sun Jun  2 16:38:36 2024]  ? rescuer_thread+0x380/0x380
[Sun Jun  2 16:38:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:38:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:38:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:38:36 2024]  </TASK>
[Sun Jun  2 16:40:36 2024] INFO: task md0_raid5:1545 blocked for more
than 241 seconds.
[Sun Jun  2 16:40:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:40:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:40:36 2024] task:md0_raid5       state:D stack:0
pid:1545  tgid:1545  ppid:2      flags:0x00004000
[Sun Jun  2 16:40:36 2024] Call Trace:
[Sun Jun  2 16:40:36 2024]  <TASK>
[Sun Jun  2 16:40:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:40:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:40:36 2024]  raid5d+0x466/0x670
[Sun Jun  2 16:40:36 2024]  ? __schedule+0x372/0x9a0
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  ? super_90_load.part.0+0x340/0x340
[Sun Jun  2 16:40:36 2024]  md_thread+0x94/0x120
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:40:36 2024]  </TASK>
[Sun Jun  2 16:40:36 2024] INFO: task jbd2/md0p1-8:1963 blocked for
more than 241 seconds.
[Sun Jun  2 16:40:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:40:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:40:36 2024] task:jbd2/md0p1-8    state:D stack:0
pid:1963  tgid:1963  ppid:2      flags:0x00004000
[Sun Jun  2 16:40:36 2024] Call Trace:
[Sun Jun  2 16:40:36 2024]  <TASK>
[Sun Jun  2 16:40:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:40:36 2024]  ? __queue_work+0x19f/0x3b0
[Sun Jun  2 16:40:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:40:36 2024]  md_write_start+0x10d/0x200
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  raid5_make_request+0x8f/0x1220
[Sun Jun  2 16:40:36 2024]  ? xas_store+0x2de/0x610
[Sun Jun  2 16:40:36 2024]  ? xas_load+0x2c/0x40
[Sun Jun  2 16:40:36 2024]  ? sched_feat_write+0x150/0x150
[Sun Jun  2 16:40:36 2024]  ? alloc_buffer_head+0x1a/0x60
[Sun Jun  2 16:40:36 2024]  ? filemap_get_entry+0x53/0x100
[Sun Jun  2 16:40:36 2024]  md_handle_request+0x132/0x210
[Sun Jun  2 16:40:36 2024]  ? __bio_split_to_limits+0x8c/0x250
[Sun Jun  2 16:40:36 2024]  ? bio_split_to_limits+0x3d/0x60
[Sun Jun  2 16:40:36 2024]  __submit_bio+0x46/0xd0
[Sun Jun  2 16:40:36 2024]  submit_bio_noacct_nocheck+0x11b/0x330
[Sun Jun  2 16:40:36 2024]  jbd2_journal_commit_transaction+0xc7a/0x1930
[Sun Jun  2 16:40:36 2024]  ? __schedule+0x372/0x9a0
[Sun Jun  2 16:40:36 2024]  kjournald2+0x95/0x220
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  ? jbd2_fc_wait_bufs+0x90/0x90
[Sun Jun  2 16:40:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:40:36 2024]  </TASK>
[Sun Jun  2 16:40:36 2024] INFO: task FileWrServerEve:2020854 blocked
for more than 241 seconds.
[Sun Jun  2 16:40:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:40:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:40:36 2024] task:FileWrServerEve state:D stack:0
pid:2020854 tgid:2020739 ppid:1      flags:0x00000002
[Sun Jun  2 16:40:36 2024] Call Trace:
[Sun Jun  2 16:40:36 2024]  <TASK>
[Sun Jun  2 16:40:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:40:36 2024]  ? ext4_getblk+0xb7/0x2c0
[Sun Jun  2 16:40:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:40:36 2024]  io_schedule+0x42/0x70
[Sun Jun  2 16:40:36 2024]  bit_wait_io+0xd/0x60
[Sun Jun  2 16:40:36 2024]  __wait_on_bit+0x48/0x140
[Sun Jun  2 16:40:36 2024]  ? bit_wait+0x60/0x60
[Sun Jun  2 16:40:36 2024]  out_of_line_wait_on_bit+0x81/0x90
[Sun Jun  2 16:40:36 2024]  ? pick_next_task_stop+0x70/0x70
[Sun Jun  2 16:40:36 2024]  do_get_write_access+0x235/0x3a0
[Sun Jun  2 16:40:36 2024]  jbd2_journal_get_write_access+0x88/0xb0
[Sun Jun  2 16:40:36 2024]  __ext4_journal_get_write_access+0x3e/0x160
[Sun Jun  2 16:40:36 2024]  __ext4_new_inode+0x5a3/0x1730
[Sun Jun  2 16:40:36 2024]  ext4_create+0xe9/0x1a0
[Sun Jun  2 16:40:36 2024]  path_openat+0xdca/0x1080
[Sun Jun  2 16:40:36 2024]  do_filp_open+0xa1/0x130
[Sun Jun  2 16:40:36 2024]  do_sys_openat2+0x74/0xa0
[Sun Jun  2 16:40:36 2024]  __x64_sys_openat+0x5c/0x70
[Sun Jun  2 16:40:36 2024]  do_syscall_64+0x3a/0xc0
[Sun Jun  2 16:40:36 2024]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[Sun Jun  2 16:40:36 2024] RIP: 0033:0x7f6b89289f80
[Sun Jun  2 16:40:36 2024] RSP: 002b:00007f6b5962d540 EFLAGS: 00000293
ORIG_RAX: 0000000000000101
[Sun Jun  2 16:40:36 2024] RAX: ffffffffffffffda RBX: 00000000000000c1
RCX: 00007f6b89289f80
[Sun Jun  2 16:40:36 2024] RDX: 00000000000000c1 RSI: 00007f6a6c5d1ee0
RDI: 00000000ffffff9c
[Sun Jun  2 16:40:36 2024] RBP: 00007f6a6c5d1ee0 R08: 0000000000000000
R09: 00000000fb82ea15
[Sun Jun  2 16:40:36 2024] R10: 00000000000081a4 R11: 0000000000000293
R12: 00000000000081a4
[Sun Jun  2 16:40:36 2024] R13: 00000000000000c1 R14: 00007f6a6c5d1ee0
R15: 00007f6b80ec9080
[Sun Jun  2 16:40:36 2024]  </TASK>
[Sun Jun  2 16:40:36 2024] INFO: task md0_resync:716263 blocked for
more than 241 seconds.
[Sun Jun  2 16:40:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:40:36 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:40:36 2024] task:md0_resync      state:D stack:0
pid:716263 tgid:716263 ppid:2      flags:0x00004000
[Sun Jun  2 16:40:36 2024] Call Trace:
[Sun Jun  2 16:40:36 2024]  <TASK>
[Sun Jun  2 16:40:36 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:40:36 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:40:36 2024]  raid5_get_active_stripe+0x1f9/0x4c0
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  raid5_sync_request+0x34f/0x370
[Sun Jun  2 16:40:36 2024]  ? is_mddev_idle+0xe7/0x150
[Sun Jun  2 16:40:36 2024]  md_do_sync+0x668/0xfa0
[Sun Jun  2 16:40:36 2024]  ? cpuacct_stats_show+0x120/0x120
[Sun Jun  2 16:40:36 2024]  ? super_90_load.part.0+0x340/0x340
[Sun Jun  2 16:40:36 2024]  md_thread+0x94/0x120
[Sun Jun  2 16:40:36 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:40:36 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:36 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:40:36 2024]  </TASK>
[Sun Jun  2 16:40:36 2024] INFO: task kworker/u385:4:1561579 blocked
for more than 241 seconds.
[Sun Jun  2 16:40:36 2024]       Not tainted 6.8.7-custom #1
[Sun Jun  2 16:40:37 2024] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Sun Jun  2 16:40:37 2024] task:kworker/u385:4  state:D stack:0
pid:1561579 tgid:1561579 ppid:2      flags:0x00004000
[Sun Jun  2 16:40:37 2024] Workqueue: writeback wb_workfn (flush-9:0)
[Sun Jun  2 16:40:37 2024] Call Trace:
[Sun Jun  2 16:40:37 2024]  <TASK>
[Sun Jun  2 16:40:37 2024]  __schedule+0x36a/0x9a0
[Sun Jun  2 16:40:37 2024]  schedule+0x27/0xc0
[Sun Jun  2 16:40:37 2024]  io_schedule+0x42/0x70
[Sun Jun  2 16:40:37 2024]  bit_wait_io+0xd/0x60
[Sun Jun  2 16:40:37 2024]  __wait_on_bit+0x48/0x140
[Sun Jun  2 16:40:37 2024]  ? bit_wait+0x60/0x60
[Sun Jun  2 16:40:37 2024]  out_of_line_wait_on_bit+0x81/0x90
[Sun Jun  2 16:40:37 2024]  ? pick_next_task_stop+0x70/0x70
[Sun Jun  2 16:40:37 2024]  do_get_write_access+0x235/0x3a0
[Sun Jun  2 16:40:37 2024]  jbd2_journal_get_write_access+0x88/0xb0
[Sun Jun  2 16:40:37 2024]  __ext4_journal_get_write_access+0x3e/0x160
[Sun Jun  2 16:40:37 2024]  ext4_mb_mark_context+0x93/0x390
[Sun Jun  2 16:40:37 2024]  ext4_mb_mark_diskspace_used+0xba/0x190
[Sun Jun  2 16:40:37 2024]  ext4_mb_new_blocks+0x13f/0xdc0
[Sun Jun  2 16:40:37 2024]  ? ext4_find_extent+0x330/0x420
[Sun Jun  2 16:40:37 2024]  ? ext4_find_extent+0x3bf/0x420
[Sun Jun  2 16:40:37 2024]  ? release_pages+0x141/0x3b0
[Sun Jun  2 16:40:37 2024]  ext4_ext_map_blocks+0x35b/0x1790
[Sun Jun  2 16:40:37 2024]  ? release_pages+0x141/0x3b0
[Sun Jun  2 16:40:37 2024]  ? filemap_get_folios_tag+0x6c/0x1b0
[Sun Jun  2 16:40:37 2024]  ? mpage_prepare_extent_to_map+0x439/0x470
[Sun Jun  2 16:40:37 2024]  ext4_map_blocks+0x164/0x5d0
[Sun Jun  2 16:40:37 2024]  ext4_do_writepages+0x694/0xba0
[Sun Jun  2 16:40:37 2024]  ? iommu_map_sg+0x113/0x1b0
[Sun Jun  2 16:40:37 2024]  ext4_writepages+0x87/0x120
[Sun Jun  2 16:40:37 2024]  do_writepages+0xac/0x160
[Sun Jun  2 16:40:37 2024]  ? wb_calc_thresh+0x41/0x50
[Sun Jun  2 16:40:37 2024]  __writeback_single_inode+0x39/0x2b0
[Sun Jun  2 16:40:37 2024]  writeback_sb_inodes+0x1ab/0x430
[Sun Jun  2 16:40:37 2024]  __writeback_inodes_wb+0x4c/0xe0
[Sun Jun  2 16:40:37 2024]  wb_writeback+0x1fd/0x250
[Sun Jun  2 16:40:37 2024]  wb_workfn+0x23d/0x400
[Sun Jun  2 16:40:37 2024]  process_one_work+0x13f/0x2c0
[Sun Jun  2 16:40:37 2024]  worker_thread+0x26d/0x390
[Sun Jun  2 16:40:37 2024]  ? rescuer_thread+0x380/0x380
[Sun Jun  2 16:40:37 2024]  kthread+0xc3/0xf0
[Sun Jun  2 16:40:37 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:37 2024]  ret_from_fork+0x2d/0x50
[Sun Jun  2 16:40:37 2024]  ? kthread_complete_and_exit+0x20/0x20
[Sun Jun  2 16:40:37 2024]  ret_from_fork_asm+0x11/0x20
[Sun Jun  2 16:40:37 2024]  </TASK>
[Sun Jun  2 16:40:37 2024] Future hung task reports are suppressed,
see sysctl kernel.hung_task_warnings

Anyone got insights at what happened here? The write workload from our
app is minor during Sundays, maybe several files per minute and
several megabytes in size.

This looks like reported problem:

https://lore.kernel.org/all/3bb2549f-846a-4179-9e23-407235a06753@xxxxxxxx/
https://lore.kernel.org/all/20240123005700.9302-1-dan@xxxxxxxx/

And the fix patch is(merged in v6.10-rc1):

https://lore.kernel.org/all/20240322081005.1112401-1-yukuai1@xxxxxxxxxxxxxxx/

Thanks,
Kuai


Best regards,
Darius Ski.

P.S. Running custom build of 6.8.7 and from quick glance in mailing
list did not see anything similar, sorry if it's dublicate and/or
solved already.

.






[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux