On Wed, 04 Mar 2015 12:55:43 -0700 Eric Mei <meijia@xxxxxxxxx> wrote:

> Hi,
>
> It is interesting to notice that RAID1 won't mark the last working drive
> as Faulty no matter what. The responsible code seems to be here:
>
> static void error(struct mddev *mddev, struct md_rdev *rdev)
> {
>         ...
>         /*
>          * If it is not operational, then we have already marked it as dead
>          * else if it is the last working disk, ignore the error, let the
>          * next level up know.
>          * else mark the drive as failed
>          */
>         if (test_bit(In_sync, &rdev->flags)
>             && (conf->raid_disks - mddev->degraded) == 1) {
>                 /*
>                  * Don't fail the drive, act as though we were just a
>                  * normal single drive.
>                  * However don't try a recovery from this drive as
>                  * it is very likely to fail.
>                  */
>                 conf->recovery_disabled = mddev->recovery_disabled;
>                 return;
>         }
>         ...
> }
>
> The end result is that even if all the drives are physically gone, one
> drive still remains in the array forever, and mdadm continues to report
> the array as degraded instead of failed. RAID10 also has similar behavior.
>
> Is there any reason we absolutely don't want to fail the last drive of
> RAID1?
>

When a RAID1 has only one drive remaining, it should act as much as
possible like a single plain ordinary drive.

How does /dev/sda behave when you physically remove the device?
md0 (as a raid1 with one drive) should do the same.

NeilBrown
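
[Editor's note: the following is a minimal user-space sketch of the condition quoted above, not the kernel code itself; struct array_state and should_fail_drive() are hypothetical stand-ins for conf->raid_disks, mddev->degraded, and the In_sync test. It shows that an error on the last working leg is ignored, while an error on a leg of a still-redundant mirror marks that drive Faulty.]

#include <stdbool.h>
#include <stdio.h>

/* Stand-in counters for conf->raid_disks and mddev->degraded. */
struct array_state {
	int raid_disks;   /* total slots in the RAID1 array       */
	int degraded;     /* how many of them are already failed  */
};

/*
 * Sketch of the check in raid1's error(): if the erroring drive is
 * still In_sync and it is the only working drive left, do NOT mark
 * it Faulty; the array keeps it and passes the I/O error up, as a
 * plain single disk would.
 */
static bool should_fail_drive(const struct array_state *s, bool in_sync)
{
	if (in_sync && (s->raid_disks - s->degraded) == 1)
		return false;   /* last working leg: keep it in the array */
	return true;            /* otherwise mark it Faulty */
}

int main(void)
{
	struct array_state healthy  = { .raid_disks = 2, .degraded = 0 };
	struct array_state one_left = { .raid_disks = 2, .degraded = 1 };

	printf("healthy 2-disk RAID1, one leg errors: fail it? %s\n",
	       should_fail_drive(&healthy, true) ? "yes" : "no");
	printf("already degraded, last leg errors:   fail it? %s\n",
	       should_fail_drive(&one_left, true) ? "yes" : "no");
	return 0;
}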