Hi John, Hi Yu,

> On 10. Aug 2024, at 00:51, John Stoffel <john@xxxxxxxxxxx> wrote:
>
>>>>>> "Christian" == Christian Theune <ct@xxxxxxxxxxxxxxx> writes:
>
>> Hi,
>>> On 9. Aug 2024, at 03:13, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>>>
>>>
>>> Yes, for sure IO are stuck in md127 and never get dispatched to nvme,
>>> for now I'll say this is a raid5 problem.
>
>> Note, that this is raid6, not raid5! Sorry, I never explicitly
>> mentioned that and it was buried in the mdstat output.
>
> That's good info.
>
> I wonder if you could setup some loop devices, build a RAID6 array,
> put XFS on it and try to replicate the problem by rsyncing a bunch of
> files.

I was about to try this, but I’m wondering what backing devices you had
in mind here? If I place the loop device images on the original
(defective) RAID 6 setup, then this wouldn’t give us much info.

However, I could take the hot spare and run a sequence of tests against
that, first with a newer kernel and potentially with an older kernel if
it doesn’t reproduce in its final form:

- xfs directly on the nvme drive
- xfs on encrypted nvme drive
- xfs on raid 1 on nvme drive, split into two partitions
- xfs on raid 5 on nvme drive, split into a few partitions
- xfs on raid 6 on nvme drive, split into a few partitions
- repeat the tests with raid1/5/6 with encrypted partitions

As that will take some time and effort, I’d like to double check whether
that sounds sensible to you as well?

Cheers,
Christian

-- 
Christian Theune · ct@xxxxxxxxxxxxxxx · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick