Re: md/raid5:Fix recover/replace stop if handle stipe failed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 14 Mar 2012 15:07:55 +0800 "majianpeng" <majianpeng@xxxxxxxxx> wrote:

> >From 849df9f6422972452b99a2c2d08d005437a52d72 Mon Sep 17 00:00:00 2001
> From: majianpeng <majianpeng@xxxxxxxxx>
> Date: Wed, 14 Mar 2012 14:41:07 +0800
> Subject: [PATCH] md/raid5:Fix recover/replace stop if handle stipe failed. 
>  If handled stipe failed when recover/replace,should not first
>  call md_done_sync(conf->mddev, STRIPE_SECTORS, 0).Beacause
>  this set MD_RECOVERY_INTR and will terminate the
>  recover/replace. And the sync_thread will repeatly start
>  and stop.

I disagree.  It is safer to stop and then (if all seems to be working) to
start again.  We will start up exactly were we left of so there is little
cost, and I think it make the code safer.


> 
> 
> Signed-off-by: majianpeng <majianpeng@xxxxxxxxx>
> ---
>  drivers/md/raid5.c |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 360f2b9..55193ef 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -2472,7 +2472,6 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
>  	int abort = 0;
>  	int i;
>  
> -	md_done_sync(conf->mddev, STRIPE_SECTORS, 0);
>  	clear_bit(STRIPE_SYNCING, &sh->state);
>  	s->syncing = 0;
>  	s->replacing = 0;
> @@ -2480,8 +2479,12 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
>  	 * For recover/replace we need to record a bad block on all
>  	 * non-sync devices, or abort the recovery
>  	 */
> -	if (!test_bit(MD_RECOVERY_RECOVER, &conf->mddev->recovery))
> +	if (!test_bit(MD_RECOVERY_RECOVER, &conf->mddev->recovery)) {
> +		md_done_sync(conf->mddev, STRIPE_SECTORS, 0);
>  		return;
> +	} else
> +		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);
> +
>  	/* During recovery devices cannot be removed, so locking and
>  	 * refcounting of rdevs is not needed
>  	 */
> @@ -2504,6 +2507,7 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
>  	if (abort) {
>  		conf->recovery_disabled = conf->mddev->recovery_disabled;
>  		set_bit(MD_RECOVERY_INTR, &conf->mddev->recovery);
> +		md_wakeup_thread(conf->mddev->thread);

This change seems unrelated to the above changes.

It isn't needed as this function is called only by the thread that you are
waking up, so it cannot be asleep.

Thanks,
NeilBrown



>  	}
>  }
>  

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux