Hi Guoqing, On Fri, 17 Dec 2021 10:26:27 +0800 Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > > index 1240a5c16af8..8b5561811431 100644 > > --- a/drivers/md/raid5.c > > +++ b/drivers/md/raid5.c > > @@ -690,6 +690,9 @@ static int has_failed(struct r5conf *conf) > > { > > int degraded; > > > > + if (test_bit(MD_BROKEN, &conf->mddev->flags)) > > + return 1; > > + > > if (conf->mddev->reshape_position == MaxSector) > > return conf->mddev->degraded > conf->max_degraded; > > > > @@ -2877,34 +2880,29 @@ static void raid5_error(struct mddev > > *mddev, struct md_rdev *rdev) unsigned long flags; > > pr_debug("raid456: error called\n"); > > > > - spin_lock_irqsave(&conf->device_lock, flags); > > - > > - if (test_bit(In_sync, &rdev->flags) && > > - mddev->degraded == conf->max_degraded) { > > - /* > > - * Don't allow to achieve failed state > > - * Don't try to recover this device > > - */ > > - conf->recovery_disabled = mddev->recovery_disabled; > > - spin_unlock_irqrestore(&conf->device_lock, flags); > > - return; > > - } > > + pr_crit("md/raid:%s: Disk failure on %s, disabling > > device.\n", > > + mdname(mddev), bdevname(rdev->bdev, b)); > > > > + spin_lock_irqsave(&conf->device_lock, flags); > > set_bit(Faulty, &rdev->flags); > > clear_bit(In_sync, &rdev->flags); > > mddev->degraded = raid5_calc_degraded(conf); > > + > > + if (has_failed(conf)) { > > + set_bit(MD_BROKEN, &mddev->flags); > > What about other callers of has_failed? Do they need to set BROKEN > flag? Or set the flag in has_failed if it returns true, just FYI. > The function checks rdev->state for faulty. There are two, places where it can be set: - raid5_error (handled here) - raid5_spare_active (not a case IMO). I left it in raid5_error to be consistent with other levels. I think that moving it t has_failed is safe but I don't see any additional value in it. I see that in raid5_error we hold device_lock. It is not true for all has_failed calls. Do you have any recommendations? Thanks, Mariusz