On 02.03.24 01:05, Song Liu wrote:
> On Fri, Mar 1, 2024 at 3:12 PM Dan Moulding <dan@xxxxxxxx> wrote:
>>
>>> 5. Looks like the block layer or underlying(scsi/virtio-scsi) may have
>>> some issue which leading to the io request from md layer stayed in a
>>> partial complete statue. I can't see how this can be related with the
>>> commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in
>>> raid5d"")
>>
>> There is no question that the above mentioned commit makes this
>> problem appear. While it may be that ultimately the root cause lies
>> outside the md/raid5 code (I'm not able to make such an assessment), I
>> can tell you that change is what turned it into a runtime
>> regression. Prior to that change, I cannot reproduce the problem. One
>> of my RAID-5 arrays has been running on every kernel version since
>> 4.8, without issue. Then kernel 6.7.1 the problem appeared within
>> hours of running the new code and affected not just one but two
>> different machines with RAID-5 arrays. With that change reverted, the
>> problem is not reproducible. Then when I recently upgraded to 6.8-rc5
>> I immediately hit the problem again (because it hadn't been reverted
>> in the mainline yet). I'm now running 6.8.0-rc5 on one of my affected
>> machines without issue after reverting that commit on top of it.
> [...]
> I also tried again to reproduce the issue, but haven't got luck. While
> I will continue try to repro the issue, I will also send the revert to 6.8
> kernel.

Is that revert on the way meanwhile? I'm asking because Linus might
release 6.8 on Sunday.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.