If a disk fails during resync, the personality's sync service usually
skips the remaining unsynchronized stripes. The sync thread
(md_do_sync()) then finishes and wakes up the main raid thread, and
md_check_recovery() runs and unregisters the sync thread. Meanwhile,
mdmon also handles the failed disk: it removes it and replaces it with
a new one, if a spare is available.

If a checkpoint is stored (with the value of the array's max_sectors),
the next md_check_recovery() restarts the resync. That resync finishes
normally and activates ALL spares (including the one just added), which
is wrong. A subsequent md_check_recovery() will not start recovery
because all disks appear to be in sync.

If the checkpoint is not stored, the second resync does not start and
recovery can proceed. So store the checkpoint only when the interrupted
resync stopped before reaching max_sectors.

Signed-off-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@xxxxxxxxx>
---
 drivers/md/md.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 3e40aad..6eda858 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6929,7 +6929,8 @@ void md_do_sync(mddev_t *mddev)
 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
 	    mddev->curr_resync > 2) {
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
+			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
+			    mddev->curr_resync < max_sectors) {
 				if (mddev->curr_resync >= mddev->recovery_cp) {
 					printk(KERN_INFO
 					       "md: checkpointing %s of %s.\n",
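
For illustration only, the effect of the changed condition can be
modelled in user space. This is a minimal sketch, not kernel code: the
struct, the typedef and both helper names are hypothetical, and only the
comparisons visible in the hunk above are reproduced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sector type. */
typedef uint64_t model_sector_t;

/* Hypothetical snapshot of the fields the hunk inspects. */
struct resync_state {
	bool interrupted;            /* models MD_RECOVERY_INTR being set */
	model_sector_t curr_resync;  /* models mddev->curr_resync */
	model_sector_t recovery_cp;  /* models mddev->recovery_cp */
	model_sector_t max_sectors;  /* models max_sectors in md_do_sync() */
};

/* Condition before the patch: any interrupted resync may checkpoint. */
static bool stores_checkpoint_old(const struct resync_state *s)
{
	return s->interrupted && s->curr_resync >= s->recovery_cp;
}

/*
 * Condition after the patch: an interrupted resync checkpoints only if
 * it stopped before max_sectors.
 */
static bool stores_checkpoint_new(const struct resync_state *s)
{
	return s->interrupted &&
	       s->curr_resync < s->max_sectors &&
	       s->curr_resync >= s->recovery_cp;
}
```

With interrupted set and curr_resync equal to max_sectors, the old
helper returns true and the new one returns false, which corresponds to
the failure case described in the commit message. For an interruption
that stops earlier, both helpers agree.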