Re: [REGRESSION] Data read from a degraded RAID 4/5/6 array could be silently corrupted.

Hi,

On 2023/11/18 8:32, Song Liu wrote:
On Thu, Nov 16, 2023 at 5:05 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

Hi,

On 2023/11/17 0:24, Song Liu wrote:
+ more folks.

On Fri, Nov 10, 2023 at 7:00 PM Bhanu Victor DiCara
<00bvd0+linux@xxxxxxxxx> wrote:

A degraded RAID 4/5/6 array can sometimes read 0s instead of the actual data.


#regzbot introduced: 10764815ff4728d2c57da677cd5d3dd6f446cf5f
(The problem does not occur in the previous commit.)

In commit 10764815ff4728d2c57da677cd5d3dd6f446cf5f, file drivers/md/raid5.c, line 5808, there is `md_account_bio(mddev, &bi);`. When this line (and the previous line) is removed, the problem does not occur.

The patch below should fix it. Please test it more thoroughly and
let me know whether it fixes everything. I will send a patch later with
more details.

Thanks,
Song

diff --git i/drivers/md/md.c w/drivers/md/md.c
index 68f3bb6e89cb..d4fb1aa5c86f 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -8674,7 +8674,8 @@ static void md_end_clone_io(struct bio *bio)
          struct bio *orig_bio = md_io_clone->orig_bio;
          struct mddev *mddev = md_io_clone->mddev;

-       orig_bio->bi_status = bio->bi_status;
+       if (bio->bi_status)
+               orig_bio->bi_status = bio->bi_status;

I'm confused, do you mean that orig_bio can have an error while bio
doesn't? If this is the case, can you explain more about how this is
possible?

Yes, this is possible.

Basically, a big bio is split by md_submit_bio => bio_split_to_limits, so
two bios end up in md_clone_bio(). Let's call them
parent_orig_bio and split_orig_bio. Errors from split_orig_bio will be
propagated to parent_orig_bio by __bio_chain_endio(). If the clone of
parent_orig_bio succeeds, md_end_clone_io may overwrite the error
reported by split_orig_bio. Does this make sense?
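
The overwrite described above can be sketched in a small userspace model.
This is not kernel code: struct model_bio, status_t, STS_OK, STS_IOERR,
chain_endio(), end_clone_io_old(), end_clone_io_fixed() and run_scenario()
are all hypothetical names made up for illustration. The sketch assumes the
split half completes (with an error) before the clone of the parent
completes (successfully); chain_endio() only mirrors the kind of guard
that __bio_chain_endio() applies.

```c
/* Hypothetical stand-in for blk_status_t: 0 is success, non-zero is an error. */
typedef int status_t;
#define STS_OK    0
#define STS_IOERR 10

/* Minimal model of a bio: only the completion status matters here. */
struct model_bio {
	status_t status;
};

/* Chained completion: record the split's error on the parent unless the
 * parent already carries one. */
static void chain_endio(struct model_bio *parent, const struct model_bio *split)
{
	if (split->status && !parent->status)
		parent->status = split->status;
}

/* Pre-patch behaviour: the clone's status is copied unconditionally. */
static void end_clone_io_old(struct model_bio *orig, const struct model_bio *clone)
{
	orig->status = clone->status;
}

/* Patched behaviour: only a failing clone may change the original's status. */
static void end_clone_io_fixed(struct model_bio *orig, const struct model_bio *clone)
{
	if (clone->status)
		orig->status = clone->status;
}

/* Assumed ordering: the split half fails first, then the parent's clone
 * succeeds. Returns the status the submitter would finally observe. */
static status_t run_scenario(int use_fix)
{
	struct model_bio parent_orig  = { STS_OK };
	struct model_bio split_orig   = { STS_IOERR };
	struct model_bio parent_clone = { STS_OK };

	chain_endio(&parent_orig, &split_orig);
	if (use_fix)
		end_clone_io_fixed(&parent_orig, &parent_clone);
	else
		end_clone_io_old(&parent_orig, &parent_clone);

	return parent_orig.status;
}
```

In this model run_scenario(0) ends with STS_OK, so the split's error is
lost, while run_scenario(1) ends with STS_IOERR and the error survives.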

Yes, this is a good catch; there should be a condition similar to the
one in __bio_chain_endio().

Thanks,
Kuai


Thanks,
Song