Thanks for the information!

On Tue, Jan 23, 2024 at 3:58 PM Dan Moulding <dan@xxxxxxxx> wrote:
>
> > It appears the md thread has hit some infinite loop, so I would like
> > to know what it is doing. We can probably get the information with
> > the perf tool, something like:
> >
> >     perf record -a
> >     perf report
>
> Here you go!
>
> # Total Lost Samples: 0
> #
> # Samples: 78K of event 'cycles'
> # Event count (approx.): 83127675745
> #
> # Overhead  Command     Shared Object       Symbol
> # ........  ..........  ..................  ...................................................
> #
>     49.31%  md0_raid5   [kernel.kallsyms]   [k] handle_stripe
>     18.63%  md0_raid5   [kernel.kallsyms]   [k] ops_run_io
>      6.07%  md0_raid5   [kernel.kallsyms]   [k] handle_active_stripes.isra.0
>      5.50%  md0_raid5   [kernel.kallsyms]   [k] do_release_stripe
>      3.09%  md0_raid5   [kernel.kallsyms]   [k] _raw_spin_lock_irqsave
>      2.48%  md0_raid5   [kernel.kallsyms]   [k] r5l_write_stripe
>      1.89%  md0_raid5   [kernel.kallsyms]   [k] md_wakeup_thread
>      1.45%  ksmd        [kernel.kallsyms]   [k] ksm_scan_thread
>      1.37%  md0_raid5   [kernel.kallsyms]   [k] stripe_is_lowprio
>      0.87%  ksmd        [kernel.kallsyms]   [k] memcmp
>      0.68%  ksmd        [kernel.kallsyms]   [k] xxh64
>      0.56%  md0_raid5   [kernel.kallsyms]   [k] __wake_up_common
>      0.52%  md0_raid5   [kernel.kallsyms]   [k] __wake_up
>      0.46%  ksmd        [kernel.kallsyms]   [k] mtree_load
>      0.44%  ksmd        [kernel.kallsyms]   [k] try_grab_page
>      0.40%  ksmd        [kernel.kallsyms]   [k] follow_p4d_mask.constprop.0
>      0.39%  md0_raid5   [kernel.kallsyms]   [k] r5l_log_disk_error
>      0.37%  md0_raid5   [kernel.kallsyms]   [k] _raw_spin_lock_irq
>      0.33%  md0_raid5   [kernel.kallsyms]   [k] release_stripe_list
>      0.31%  md0_raid5   [kernel.kallsyms]   [k] release_inactive_stripe_list

It appears the thread is indeed doing something. I haven't had any luck
reproducing this on my hosts. Could you please try whether the following
change fixes the issue (without reverting 0de40f76d567)? I will try to
reproduce the issue on my side.

Junxiao,

Please also help look into this.
Thanks,
Song