On Thu, Nov 16, 2023 at 5:05 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2023/11/17 0:24, Song Liu wrote:
> > + more folks.
> >
> > On Fri, Nov 10, 2023 at 7:00 PM Bhanu Victor DiCara
> > <00bvd0+linux@xxxxxxxxx> wrote:
> >>
> >> A degraded RAID 4/5/6 array can sometimes read 0s instead of the actual data.
> >>
> >> #regzbot introduced: 10764815ff4728d2c57da677cd5d3dd6f446cf5f
> >> (The problem does not occur in the previous commit.)
> >>
> >> In commit 10764815ff4728d2c57da677cd5d3dd6f446cf5f, file drivers/md/raid5.c, line 5808, there is `md_account_bio(mddev, &bi);`. When this line (and the previous line) is removed, the problem does not occur.
> >
> > The patch below should fix it. Please give it more thorough tests and
> > let me know whether it fixes everything. I will send patch later with
> > more details.
> >
> > Thanks,
> > Song
> >
> > diff --git i/drivers/md/md.c w/drivers/md/md.c
> > index 68f3bb6e89cb..d4fb1aa5c86f 100644
> > --- i/drivers/md/md.c
> > +++ w/drivers/md/md.c
> > @@ -8674,7 +8674,8 @@ static void md_end_clone_io(struct bio *bio)
> >         struct bio *orig_bio = md_io_clone->orig_bio;
> >         struct mddev *mddev = md_io_clone->mddev;
> >
> > -       orig_bio->bi_status = bio->bi_status;
> > +       if (bio->bi_status)
> > +               orig_bio->bi_status = bio->bi_status;
>
> I'm confused, do you mean that orig_bio can have error while bio
> doesn't? If this is the case, can you explain more how this is
> possible?

Yes, this is possible. Basically, a big bio is split by
md_submit_bio() => bio_split_to_limits(), so we will have two bios
going into md_clone_bio(). Let's call them parent_orig_bio and
split_orig_bio. Errors from split_orig_bio will be propagated to
parent_orig_bio by __bio_chain_endio(). If the clone of parent_orig_bio
then succeeds, md_end_clone_io() may overwrite the error reported by
split_orig_bio with the clone's success status.

Does this make sense?

Thanks,
Song
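
To make the sequence above concrete, here is a minimal user-space sketch.
It is a toy model, not kernel code: struct toy_bio, TOY_IOERR, the two
end_clone_*() helpers and simulate() are made up for illustration, and
the ordering of completions is an assumption. Only the unconditional
versus conditional status assignment mirrors the diff.

```c
#include <stdio.h>

/* Hypothetical stand-in for struct bio: 0 means success, nonzero an error. */
struct toy_bio {
	int status;
};

/* Arbitrary nonzero error code chosen for this sketch. */
#define TOY_IOERR 10

/* Behaviour before the patch: the clone's status always replaces the original's. */
static void end_clone_unconditional(struct toy_bio *orig, const struct toy_bio *clone)
{
	orig->status = clone->status;
}

/* Behaviour with the patch: copy the status only when the clone failed. */
static void end_clone_patched(struct toy_bio *orig, const struct toy_bio *clone)
{
	if (clone->status)
		orig->status = clone->status;
}

/*
 * Simulate the case from the mail: the split bio has already failed and its
 * error was propagated to the parent; afterwards the parent's own clone
 * completes successfully. Returns the parent's final status.
 */
static int simulate(void (*end_clone)(struct toy_bio *, const struct toy_bio *))
{
	struct toy_bio parent_orig = { 0 };
	struct toy_bio parent_clone = { 0 };	/* the parent's clone succeeds */

	/* Models the chained split bio propagating its error to the parent. */
	parent_orig.status = TOY_IOERR;

	/* Models the completion handler running for the parent's clone. */
	end_clone(&parent_orig, &parent_clone);

	return parent_orig.status;
}
```

Calling simulate(end_clone_unconditional) returns 0 in this model, i.e.
the propagated error is lost, while simulate(end_clone_patched) returns
TOY_IOERR, i.e. the error survives.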