On Fri, Dec 17, 2021 at 10:00 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote: > > > > On 12/16/21 10:52 PM, Mariusz Tkaczyk wrote: > > Patch 62f7b1989c0 ("md raid0/linear: Mark array as 'broken' and fail BIOs > > if a member is gone") allowed to finish writes earlier (before level > > dependent actions) for non-redundant arrays. > > > > To achieve that MD_BROKEN is added to mddev->flags if drive disappearance > > is detected. This is done in is_mddev_broken() which is confusing and not > > consistent with other levels where error_handler() is used. > > This patch adds appropriate error_handler for raid0 and linear. > > > > It also adopts md_error(), we only want to call .error_handler for those > > levels. mddev->pers->sync_request is additionally checked, its existence > > implies a level with redundancy. > > > > Usage of error_handler causes that disk failure can be requested from > > userspace. User can fail the array via #mdadm --set-faulty command. This > > is not safe and will be fixed in mdadm. It is correctable because failed > > state is not recorded in the metadata. After next assembly array will be > > read-write again. For safety reason is better to keep MD_BROKEN in > > runtime only. > > > > Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx> > > --- > > drivers/md/md-linear.c | 15 ++++++++++++++- > > drivers/md/md.c | 6 +++++- > > drivers/md/md.h | 10 ++-------- > > drivers/md/raid0.c | 15 ++++++++++++++- > > 4 files changed, 35 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c > > index 1ff51647a682..415d2615d17a 100644 > > --- a/drivers/md/md-linear.c > > +++ b/drivers/md/md-linear.c > > @@ -233,7 +233,8 @@ static bool linear_make_request(struct mddev *mddev, struct bio *bio) > > bio_sector < start_sector)) > > goto out_of_bounds; > > > > - if (unlikely(is_mddev_broken(tmp_dev->rdev, "linear"))) { > > + if (unlikely(is_rdev_broken(tmp_dev->rdev))) { > > + md_error(mddev, tmp_dev->rdev); > > bio_io_error(bio); > > return true; > > } > > @@ -281,6 +282,17 @@ static void linear_status (struct seq_file *seq, struct mddev *mddev) > > seq_printf(seq, " %dk rounding", mddev->chunk_sectors / 2); > > } > > > > +static void linear_error(struct mddev *mddev, struct md_rdev *rdev) > > +{ > > + char b[BDEVNAME_SIZE]; > > + > > + if (!test_and_set_bit(MD_BROKEN, &rdev->mddev->flags)) > > + pr_crit("md/linear%s: Disk failure on %s detected.\n" > > + "md/linear:%s: Cannot continue, failing array.\n", > > + mdname(mddev), bdevname(rdev->bdev, b), > > + mdname(mddev)); > > +} > > + > > Do you consider to use %pg to print block device? > > > static void linear_quiesce(struct mddev *mddev, int state) > > { > > } > > @@ -297,6 +309,7 @@ static struct md_personality linear_personality = > > .hot_add_disk = linear_add, > > .size = linear_size, > > .quiesce = linear_quiesce, > > + .error_handler = linear_error, > > }; > > > > static int __init linear_init (void) > > diff --git a/drivers/md/md.c b/drivers/md/md.c > > index e8666bdc0d28..f888ef197765 100644 > > --- a/drivers/md/md.c > > +++ b/drivers/md/md.c > > @@ -7982,7 +7982,11 @@ void md_error(struct mddev *mddev, struct md_rdev *rdev) > > > > if (!mddev->pers || !mddev->pers->error_handler) > > return; > > - mddev->pers->error_handler(mddev,rdev); > > + mddev->pers->error_handler(mddev, rdev); > > + > > + if (!mddev->pers->sync_request) > > + return; > > + > > What is the reason of the above change? I suppose dm event can be missed. Hi Guoqing What's the dm event here? From my understanding, this patch only wants to set MD_BROKEN in error handler. Now raid0 and linear only need to do this job in md_error. It checks ->sync_request here and can return for raid0 and linear. In the error handler raid0 and linear already set MD_BROKEN. Best Regards Xiao