Re: [PATCH] md: only unlock mddev from action_store

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 6, 2022 at 5:55 PM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
>
>
>
> On 6/6/22 5:36 PM, Xiao Ni wrote:
> >>> @@ -4827,12 +4827,14 @@ action_store(struct mddev *mddev, const char
> >>> *page, size_t len)
> >>>                           clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
> >>>                   if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
> >>>                       mddev_lock(mddev) == 0) {
> >>> +                       mddev_suspend(mddev);
> >>>                           if (work_pending(&mddev->del_work))
> >>>                                   flush_workqueue(md_misc_wq);
> >>>                           if (mddev->sync_thread) {
> >>>                                   set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> >>>                                   md_reap_sync_thread(mddev);
> >>>                           }
> >>> +                       mddev_resume(mddev);
> >>>                           mddev_unlock(mddev);
> >>>                   }
> >>>           } else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
> >> The action actually means sync action which are for internal IO instead
> >> of external IO, I suppose the semantic is different with above change.
> > The original problem should be i/o happen when interrupting the sync thread?
> > So the external i/o calls md_write_start and sets MD_SB_CHANGE_PENDING.
> > Then raid5d->md_check_recovery can't update superblock and handle internal
> > stripes. So the `echo idle` action is stuck.
>
> My point is action_store shouldn't disturb external IO, it should only
> deal with
> sync IO and relevant stuffs as the function is for sync_action node, no?

Yes, it's a change that was mentioned above. It depends on which way
we want to go.
In some sysfs file store functions, it also calls mddev_suspend. For
some situations,
it needs to choose if we need to stop the i/o temporarily.

Anyway, it's only a possible method I want to discuss. It's very good
for me to dig into this problem.
Now I have a better understanding of those codes :)

Best Regards
Xiao




[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux