Re: The read data is wrong from raid5 when recovery happens

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 29, 2023 at 10:27 AM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
>
>
>
> On 5/26/23 15:23, Xiao Ni wrote:
> >
> >>>>> 6. mdadm /dev/md126 --add /dev/sdd
> >>>>> 7. create 31 processes that writes and reads. It compares the content with
> >>>>> md5sum. The test will go on until the recovery stops
> >> Could you share the test code/script for step 7? Will try it from my side.
> > The test scripts are written by people from intel.
> > Hi, Mariusz. Can I share the test scripts here?
> >
> >>>>> 8. wait for about 10 minutes, we can see some processes report checksum is
> >>>>> wrong. But if it re-read the data again, the checksum will be good.
> >> So it is interim, I guess it appeared before recover was finished.
> > Yes, it appears before recovery finishes. The test will finish once
> > the recovery finishes.
> >
> >>>>> I tried to narrow this problem like this:
> >>>>>
> >>>>> -       md_account_bio(mddev, &bi);
> >>>>> +       if (rw == WRITE)
> >>>>> +               md_account_bio(mddev, &bi);
> >>>>> If it only do account for write requests, the problem can disappear.
> >>>>>
> >>>>> -       if (rw == READ && mddev->degraded == 0 &&
> >>>>> -           mddev->reshape_position == MaxSector) {
> >>>>> -               bi = chunk_aligned_read(mddev, bi);
> >>>>> -               if (!bi)
> >>>>> -                       return true;
> >>>>> -       }
> >>>>> +       //if (rw == READ && mddev->degraded == 0 &&
> >>>>> +       //    mddev->reshape_position == MaxSector) {
> >>>>> +       //      bi = chunk_aligned_read(mddev, bi);
> >>>>> +       //      if (!bi)
> >>>>> +       //              return true;
> >>>>> +       //}
> >>>>>
> >>>>>            if (unlikely(bio_op(bi) == REQ_OP_DISCARD)) {
> >>>>>                    make_discard_request(mddev, bi);
> >>>>> @@ -6180,7 +6180,8 @@ static bool raid5_make_request(struct mddev *mddev,
> >>>>> struct bio * bi)
> >>>>>                            md_write_end(mddev);
> >>>>>                    return true;
> >>>>>            }
> >>>>> -       md_account_bio(mddev, &bi);
> >>>>> +       if (rw == READ)
> >>>>> +               md_account_bio(mddev, &bi);
> >>>>>
> >>>>> I comment the chunk_aligned_read out and only account for read requests,
> >>>>> this problem can be reproduced.
>
> Only write bio and non aligned chunk read bio call md_account_bio, and
> only account
> write bio is fine per your test. It means the md5sum didn't match
> because of non aligned
> chunk read bio, so it is not abnormal that data in another chunk could
> be changed with
> the recovery is not finished, right?

That's right, only non aligned read requests can cause this problem.
Good catch. If I understand right, you mean the non aligned read
request reads data from the chunk which hasn't been recovered, right?

>
> BTW, I had run the test with bio accounting disabled by default, and
> seems the result is
> same.
>
> > git tag  --sort=taggerdate --contain 10764815f |head -1
> v5.14-rc1
>
> localhost:~/readdata #uname -r
> 5.15.0-rc4-59.24-default
> localhost:~/readdata #cat /sys/block/md126/queue/iostats
> 0
>
> And I can still see relevant log from the terminal which runs 01-test.sh

Hmm, thanks for this. I'll have a try again. Which kind of disks do
you use for testing?

Regards
Xiao
>
> file /tmp/pythontest/data.out.nhoBR6 Test failed
> (org_checksum|checksum_to_match) [582df4ccea6f5851462379fe4b17abf6
>   -|d41d8cd98f00b204e9800998ec
> f8427e  -]
> file /tmp/pythontest/data.out.nhoBR6 calculate checksum again:
> d41d8cd98f00b204e9800998ecf8427e  -
>
> file /tmp/pythontest/data.out.srGqF9 Test failed
> (org_checksum|checksum_to_match) [7a85ec35b171e52ce41ebdd5da86f1d9
>   -|d41d8cd98f00b204e9800998ec
> f8427e  -]
> file /tmp/pythontest/data.out.srGqF9 calculate checksum again:
> d41d8cd98f00b204e9800998ecf8427e  -
> ...
>
> file /tmp/pythontest/data.out.q6m1kK Test failed
> (org_checksum|checksum_to_match) [996f5bb5d2a2ddfd378a80a563ab4ad5
>   -|d41d8cd98f00b204e9800998ec
> f8427e  -]
> file /tmp/pythontest/data.out.q6m1kK calculate checksum again:
> d41d8cd98f00b204e9800998ecf8427e  -
>
> Rebuild ended, testing done!
> stresstest end at : Sun May 28 22:23:53 EDT 2023
> Sun May 28 22:23:53 EDT 2023
>
> Thanks,
> Guoqing
>





[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux