On Mon, Jun 6, 2022 at 5:55 PM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > > > On 6/6/22 5:36 PM, Xiao Ni wrote: > >>> @@ -4827,12 +4827,14 @@ action_store(struct mddev *mddev, const char > >>> *page, size_t len) > >>> clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); > >>> if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) && > >>> mddev_lock(mddev) == 0) { > >>> + mddev_suspend(mddev); > >>> if (work_pending(&mddev->del_work)) > >>> flush_workqueue(md_misc_wq); > >>> if (mddev->sync_thread) { > >>> set_bit(MD_RECOVERY_INTR, &mddev->recovery); > >>> md_reap_sync_thread(mddev); > >>> } > >>> + mddev_resume(mddev); > >>> mddev_unlock(mddev); > >>> } > >>> } else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) > >> The action actually means sync action which are for internal IO instead > >> of external IO, I suppose the semantic is different with above change. > > The original problem should be i/o happen when interrupting the sync thread? > > So the external i/o calls md_write_start and sets MD_SB_CHANGE_PENDING. > > Then raid5d->md_check_recovery can't update superblock and handle internal > > stripes. So the `echo idle` action is stuck. > > My point is action_store shouldn't disturb external IO, it should only > deal with > sync IO and relevant stuffs as the function is for sync_action node, no? Yes, it's a change that was mentioned above. It depends on which way we want to go. In some sysfs file store functions, it also calls mddev_suspend. For some situations, it needs to choose if we need to stop the i/o temporarily. Anyway, it's only a possible method I want to discuss. It's very good for me to dig into this problem. Now I have a better understanding of those codes :) Best Regards Xiao