On 2012-09-20 11:36 NeilBrown <neilb@xxxxxxx> Wrote:
>On Sat, 15 Sep 2012 16:59:34 +0800 "Jianpeng Ma" <majianpeng@xxxxxxxxx> wrote:
>
>> According to commit 97e4f42d62badb0f9fbc27c013e89, the checkpoint of
>> sync/recovery is updated 16 times in md_do_sync().
>> Because HDDs have become larger, sync/recovery may take a long time,
>> so 1/16 of that time may be half an hour or more.
>> So we should add more chances to update the checkpoint.
>> There are two places in md_do_sync where the checkpoint can be updated:
>> 1: If cond_resched is called and really reschedules
>> 2: If currspeed is larger than the maximum sync speed
>> If either condition holds, we can try to update the checkpoint.
>>
>> Signed-off-by: Jianpeng Ma <majianpeng@xxxxxxxxx>
>> ---
>>  drivers/md/md.c |   16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 3f6203a..c7993d6 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -7496,7 +7496,14 @@ void md_do_sync(struct mddev *mddev)
>>  		 * about not overloading the IO subsystem. (things like an
>>  		 * e2fsck being done on the RAID array should execute fast)
>>  		 */
>> -		cond_resched();
>> +		if (cond_resched())
>> +			if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>> +			    mddev->curr_resync_completed != j &&
>> +			    atomic_read(&mddev->recovery_active) == 0) {
>> +				mddev->curr_resync_completed = j;
>> +				set_bit(MD_CHANGE_CLEAN, &mddev->flags);
>> +				sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>> +			}
>>
>>  		currspeed = ((unsigned long)(io_sectors-mddev->resync_mark_cnt))/2
>>  			/((jiffies-mddev->resync_mark)/HZ +1) +1;
>> @@ -7505,6 +7512,13 @@ void md_do_sync(struct mddev *mddev)
>>  			if ((currspeed > speed_max(mddev)) ||
>>  					!is_mddev_idle(mddev, 0)) {
>>  				msleep(500);
>> +				if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>> +				    mddev->curr_resync_completed != j &&
>> +				    atomic_read(&mddev->recovery_active) == 0) {
>> +					mddev->curr_resync_completed = j;
>> +					set_bit(MD_CHANGE_CLEAN, &mddev->flags);
>> +					sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>> +				}
>>  				goto repeat;
>>  			}
>>  		}
>
>I don't really like this. These two conditions seem rather arbitrary.
>If we want to do a checkpoint more often, we should use some time based test
>to do it.
>
>What results do you get with this change? How often does a checkpoint happen
>on a busy system? How often on an idle system?

My thought is that if cond_resched or msleep has returned and
atomic_read(&mddev->recovery_active) == 0, we can update recovery_cp right
away, without having to wait for mddev->recovery_active == 0.
There are many places that check recovery_cp, so updating recovery_cp as
often as possible may be good.

>
>A time-based update could be done in user-space. Just write 'idle' to
>'sync_action' and it should do a checkpoint, then immediately restart from
>where it left off.
>
>NeilBrown
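
For reference, the user-space approach Neil describes above (write 'idle' to
'sync_action' on a timer) could be sketched roughly as below. This is only a
minimal illustration, not a tested tool: the helper names write_attr,
request_checkpoint and checkpoint_periodically are made up for this example,
and the array name md0 and the interval are assumptions. Whether a checkpoint
is actually recorded and the sync resumes from it depends on the kernel
behaving as Neil says it should.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure with errno set. */
int write_attr(const char *path, const char *value)
{
	assert(path != NULL && value != NULL);

	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	size_t len = strlen(value);
	ssize_t n = write(fd, value, len);
	int saved = errno;
	close(fd);

	if (n < 0) {
		errno = saved;
		return -1;
	}
	if ((size_t)n != len) {
		errno = EIO;	/* short write: treat as an error */
		return -1;
	}
	return 0;
}

/* Interrupt the running sync action by writing "idle", as suggested
 * in the thread. The path is supplied by the caller; for an array
 * named md0 (an assumption) it would be
 * /sys/block/md0/md/sync_action. */
int request_checkpoint(const char *sync_action_path)
{
	return write_attr(sync_action_path, "idle");
}

/* Time-based loop: request a checkpoint `rounds` times, sleeping
 * `interval_sec` seconds between requests. Stops at the first error. */
int checkpoint_periodically(const char *sync_action_path,
			    unsigned int interval_sec, unsigned int rounds)
{
	for (unsigned int i = 0; i < rounds; i++) {
		if (i > 0)
			sleep(interval_sec);
		if (request_checkpoint(sync_action_path) != 0)
			return -1;
	}
	return 0;
}
```

A caller running as root might use something like
checkpoint_periodically("/sys/block/md0/md/sync_action", 300, 12) to ask for
a checkpoint every five minutes for an hour; both numbers are arbitrary here.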