On Wednesday June 4, hubskml@xxxxxxx wrote:
> Hello
>
> According to mdadm's man page:
> "When creating a RAID5 array, mdadm will automatically create a degraded
> array with an extra spare drive. This is because building the spare
> into a degraded array is in general faster than resyncing the parity on
> a non-degraded, but not clean, array. This feature can be over-ridden
> with the --force option."
>
> Unfortunately, I'm seeing a kind of bug when I create a RAID5 array with
> an internal bitmap, then stop the array before the initial
> synchronization is done and restart the array.
>
> 1° When I create the array with an internal bitmap:
>    mdadm -C /dev/md_d1 -e 1.2 -l 5 -n 4 -b internal -R /dev/sd?
> I see the last disk as a spare disk. After the restart of the array, all
> disks are seen as active and the array does not continue the aborted
> synchronization!
> Note that I did not use the --assume-clean option.
>
> 2° When I create the array without a bitmap:
>    mdadm -C /dev/md_d1 -e 1.2 -l 5 -n 4 -R /dev/sd?
> I see the last disk as a spare disk. After the restart of the array, the
> spare disk is still a spare disk and the array continues the
> synchronization where it had stopped.
>
> In case 1°, is this a bug or did I miss something?

Thanks for the detailed report.

Yes, this is a bug.  The following patch fixes it, though I'm not 100%
sure it is the right fix.  It may cause too much resync in some cases,
which is better than not enough, but still not ideal.

> Secondly, what could be the consequences of this non-performed
> synchronization?

If you lose a drive, the data might get corrupted.

When writing to the array, the new parity block will sometimes be
calculated assuming that it was previously correct.  If all updates to a
particular parity block are of this sort, then it will still be
incorrect when you lose a drive, and data recovered based on that parity
block will be incorrect.

Until you lose a drive, it will have no visible effect.
NeilBrown

Signed-off-by: Neil Brown <neilb@xxxxxxx>

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2008-06-10 10:27:51.000000000 +1000
+++ ./drivers/md/raid5.c	2008-06-12 09:34:25.000000000 +1000
@@ -4094,7 +4094,9 @@ static int run(mddev_t *mddev)
 				" disk %d\n",
 				bdevname(rdev->bdev,b), raid_disk);
 			working_disks++;
-		}
+		} else
+			/* Cannot rely on bitmap to complete recovery */
+			conf->fullsync = 1;
 	}
 
 	/*
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html