Hi all,

I can reproduce this quickly in my environment with the commands mentioned at
https://www.spinics.net/lists/raid/msg73521.html. With the patch applied, the
problem has not happened after several hours of testing. This patch works for me.

Tested-by: Xiao Ni <xni@xxxxxxxxxx>

On Fri, Nov 17, 2023 at 12:25 AM Song Liu <song@xxxxxxxxxx> wrote:
>
> + more folks.
>
> On Fri, Nov 10, 2023 at 7:00 PM Bhanu Victor DiCara
> <00bvd0+linux@xxxxxxxxx> wrote:
> >
> > A degraded RAID 4/5/6 array can sometimes read 0s instead of the actual data.
> >
> >
> > #regzbot introduced: 10764815ff4728d2c57da677cd5d3dd6f446cf5f
> > (The problem does not occur in the previous commit.)
> >
> > In commit 10764815ff4728d2c57da677cd5d3dd6f446cf5f, file drivers/md/raid5.c, line 5808, there is `md_account_bio(mddev, &bi);`. When this line (and the previous line) is removed, the problem does not occur.
>
> The patch below should fix it. Please give it more thorough tests and
> let me know whether it fixes everything. I will send a patch later with
> more details.
>
> Thanks,
> Song
>
> diff --git i/drivers/md/md.c w/drivers/md/md.c
> index 68f3bb6e89cb..d4fb1aa5c86f 100644
> --- i/drivers/md/md.c
> +++ w/drivers/md/md.c
> @@ -8674,7 +8674,8 @@ static void md_end_clone_io(struct bio *bio)
>         struct bio *orig_bio = md_io_clone->orig_bio;
>         struct mddev *mddev = md_io_clone->mddev;
>
> -       orig_bio->bi_status = bio->bi_status;
> +       if (bio->bi_status)
> +               orig_bio->bi_status = bio->bi_status;
>
>         if (md_io_clone->start_time)
>                 bio_end_io_acct(orig_bio, md_io_clone->start_time);
>
>
> >
> > Similarly, in commit ffc253263a1375a65fa6c9f62a893e9767fbebfa (v6.6), file drivers/md/raid5.c, when line 6200 is removed, the problem does not occur.
> >
> >
> > Steps to reproduce the problem (using bash or similar):
> > 1. Create a degraded RAID 4/5/6 array:
> > fallocate -l 2056M test_array_part_1.img
> > fallocate -l 2056M test_array_part_2.img
> > lo1=$(losetup --sector-size 4096 --find --nooverlap --direct-io --show test_array_part_1.img)
> > lo2=$(losetup --sector-size 4096 --find --nooverlap --direct-io --show test_array_part_2.img)
> > # The RAID level must be 4, 5, or 6 with at least 1 missing drive in any order. The following configuration seems to be the most effective:
> > mdadm --create /dev/md/tmp_test_array --level=4 --raid-devices=3 --chunk=1M --size=2G $lo1 missing $lo2
> >
> > 2. Create the test file system and clone it to the degraded array:
> > fallocate -l 4G test_fs.img
> > mke2fs -t ext4 -b 4096 -i 65536 -m 0 -E stride=256,stripe_width=512 -L test_fs test_fs.img
> > lo3=$(losetup --sector-size 4096 --find --nooverlap --direct-io --show test_fs.img)
> > mount $lo3 /mnt/1
> > python3 create_test_fs.py /mnt/1
> > umount /mnt/1
> > cat test_fs.img > /dev/md/tmp_test_array
> > cmp -l test_fs.img /dev/md/tmp_test_array  # Optionally verify the clone
> > mount --read-only $lo3 /mnt/1
> >
> > 3. Mount the degraded array:
> > mount --read-only /dev/md/tmp_test_array /mnt/2
> >
> > 4. Compare the files:
> > diff -q /mnt/1 /mnt/2
> >
> > If no files are detected as different, do `umount /mnt/2` and `echo 2 > /proc/sys/vm/drop_caches`, and then go to step 3.
> > (Doing `echo 3 > /proc/sys/vm/drop_caches` and then going to step 4 is less effective.)
> > (Only doing `umount /mnt/2` and/or `echo 1 > /proc/sys/vm/drop_caches` is much less effective and the effectiveness wears off.)
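
For longer runs, steps 3 and 4 above can be wrapped in a small loop. This is
only a rough sketch based on the quoted steps (the iteration count is an
arbitrary choice of mine; device and mount point names are the ones from the
report):

# Repeat the mount / compare / drop-caches cycle until a mismatch shows up.
for i in $(seq 1 50); do
    mount --read-only /dev/md/tmp_test_array /mnt/2
    if ! diff -q /mnt/1 /mnt/2; then
        # diff reported differing files (or failed), so stop here and
        # leave /mnt/2 mounted for inspection.
        echo "Mismatch detected on iteration $i"
        break
    fi
    umount /mnt/2
    # 'echo 2' (drop reclaimable slab objects, i.e. dentries and inodes)
    # was reported above as the most effective variant.
    echo 2 > /proc/sys/vm/drop_caches
done
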
> >
> > create_test_fs.py:
> > import errno
> > import itertools
> > import os
> > import random
> > import sys
> >
> >
> > def main(test_fs_path):
> >     rng = random.Random(0)
> >     try:
> >         for i in itertools.count():
> >             size = int(2**rng.uniform(12, 24))
> >             with open(os.path.join(test_fs_path, str(i).zfill(4) + '.bin'), 'xb') as f:
> >                 f.write(b'\xff' * size)
> >             print(f'Created file {f.name!r} with size {size}')
> >     except OSError as e:
> >         if e.errno != errno.ENOSPC:
> >             raise
> >         print(f'Done: {e.strerror} (partially created file {f.name!r})')
> >
> >
> > if __name__ == '__main__':
> >     main(sys.argv[1])
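
Not part of the report above, just for completeness: a rough sketch of how the
test setup can be torn down afterwards, assuming the shell variables ($lo1,
$lo2, $lo3) from steps 1 and 2 are still set:

# Unmount the test file system and the degraded array, then stop the array.
umount /mnt/1 /mnt/2 2>/dev/null
mdadm --stop /dev/md/tmp_test_array
# Detach the loop devices created in steps 1 and 2.
losetup -d $lo1 $lo2 $lo3
# Remove the backing image files.
rm -f test_array_part_1.img test_array_part_2.img test_fs.img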