Re: Re: [PATCH] md: Add two chances to update sync/recovery checkpoint

"Jianpeng Ma" <majianpeng@xxxxxxxxx> · Thu, 20 Sep 2012 14:14:30 +0800



On 2012-09-20 11:36 NeilBrown <neilb@xxxxxxx> Wrote:
>On Sat, 15 Sep 2012 16:59:34 +0800 "Jianpeng Ma" <majianpeng@xxxxxxxxx> wrote:
>
>> According to commit 97e4f42d62badb0f9fbc27c013e89, md_do_sync() updates
>> the sync/recovery checkpoint 16 times over the course of a run.
>> Because HDDs have become larger, a sync/recovery can take a long time,
>> so 1/16 of that time may be half an hour or more.
>> It therefore makes sense to add more chances to update the checkpoint.
>> There are two places in md_do_sync() where the checkpoint can be updated:
>> 1: If cond_resched() is called and actually reschedules
>> 2: If currspeed is larger than speed_max(mddev)
>> When one of these conditions holds, we can try to update the checkpoint.
>> 
>> Signed-off-by: Jianpeng Ma <majianpeng@xxxxxxxxx>
>> ---
>>  drivers/md/md.c |   16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 3f6203a..c7993d6 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -7496,7 +7496,14 @@ void md_do_sync(struct mddev *mddev)
>>  		 * about not overloading the IO subsystem. (things like an
>>  		 * e2fsck being done on the RAID array should execute fast)
>>  		 */
>> -		cond_resched();
>> +		if (cond_resched())
>> +			if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>> +					mddev->curr_resync_completed != j &&
>> +					atomic_read(&mddev->recovery_active) == 0) {
>> +				mddev->curr_resync_completed = j;
>> +				set_bit(MD_CHANGE_CLEAN, &mddev->flags);
>> +				sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>> +			}
>>  
>>  		currspeed = ((unsigned long)(io_sectors-mddev->resync_mark_cnt))/2
>>  			/((jiffies-mddev->resync_mark)/HZ +1) +1;
>> @@ -7505,6 +7512,13 @@ void md_do_sync(struct mddev *mddev)
>>  			if ((currspeed > speed_max(mddev)) ||
>>  					!is_mddev_idle(mddev, 0)) {
>>  				msleep(500);
>> +				if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>> +					mddev->curr_resync_completed != j &&
>> +					atomic_read(&mddev->recovery_active) == 0) {
>> +					mddev->curr_resync_completed = j;
>> +					set_bit(MD_CHANGE_CLEAN, &mddev->flags);
>> +					sysfs_notify(&mddev->kobj, NULL, "sync_completed");
>> +				}
>>  				goto repeat;
>>  			}
>>  		}
>
>I don't really like this.  These two conditions seem rather arbitrary.
>If we want to do a checkpoint more often, we should use some time-based test
>to do it.
>
>What results do you get with this change?  How often does a checkpoint happen
>on a busy system?  How often on an idle system?
My thought is that if cond_resched() or msleep() has returned and atomic_read(&mddev->recovery_active) == 0,
we can update recovery_cp right away and don't have to wait for mddev->recovery_active == 0.
recovery_cp is checked in many places, so updating it as often as possible may be good.
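
For comparison, here is a rough user-space sketch of what a time-based test could look like. It is not kernel code and not a proposed patch: struct sync_state, maybe_checkpoint() and CHECKPOINT_INTERVAL are hypothetical names that only model the three conditions from the diff (no reshape, recovery_active == 0, curr_resync_completed != j) plus a minimum interval between checkpoints. The interval value is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct mddev fields the patch touches. */
struct sync_state {
    uint64_t curr_resync_completed; /* last checkpointed sector */
    int recovery_active;            /* in-flight resync I/O (an atomic_t in the kernel) */
    bool reshape;                   /* stands in for MD_RECOVERY_RESHAPE */
    uint64_t last_checkpoint;       /* time of the last checkpoint, in ticks */
    unsigned checkpoints;           /* number of checkpoints taken so far */
};

/* Assumed minimum spacing between checkpoints, in ticks (not from the thread). */
#define CHECKPOINT_INTERVAL 300

/*
 * Record position j as the new checkpoint only if enough time has passed
 * since the previous one and the conditions from the patch hold.
 * Returns true if a checkpoint was taken.
 */
static bool maybe_checkpoint(struct sync_state *s, uint64_t j, uint64_t now)
{
    assert(s != NULL);

    if (now - s->last_checkpoint < CHECKPOINT_INTERVAL)
        return false; /* too soon since the last checkpoint */
    if (s->reshape || s->recovery_active != 0 || s->curr_resync_completed == j)
        return false; /* same guards as in the patch */

    s->curr_resync_completed = j;
    s->last_checkpoint = now;
    s->checkpoints++;
    return true;
}
```

In the kernel the time comparison would presumably be done on jiffies, and the body would also set MD_CHANGE_CLEAN and call sysfs_notify() as the patch does. The sketch leaves those out.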
>
>A time-based update could be done in user-space.  Just write 'idle' to
>'sync_action' and it should do a checkpoint, then immediately restart from
>where it left off.
>
>NeilBrown
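
The user-space approach described above could be sketched as a small C helper that writes 'idle' to a sync_action file. request_checkpoint() is a hypothetical name, and the path /sys/block/md0/md/sync_action in the usage note assumes an array called md0. Whether the write actually checkpoints and then resumes from the same position is taken from the description above, not verified here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "idle" to the given sync_action file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int request_checkpoint(const char *sync_action_path)
{
    assert(sync_action_path != NULL);

    FILE *f = fopen(sync_action_path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs("idle\n", f) >= 0;
    if (fclose(f) != 0) /* buffered data is flushed here, so check this too */
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller would run something like request_checkpoint("/sys/block/md0/md/sync_action") at whatever interval it wants checkpoints, for example from a loop with sleep() in between. Writing to sysfs normally requires root.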