Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Thorsten,

On Wed, Mar 6, 2024 at 12:38 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
>
> On 02.03.24 01:05, Song Liu wrote:
> > On Fri, Mar 1, 2024 at 3:12 PM Dan Moulding <dan@xxxxxxxx> wrote:
> >>
> >>> 5. Looks like the block layer or underlying(scsi/virtio-scsi) may have
> >>> some issue which leading to the io request from md layer stayed in a
> >>> partial complete statue. I can't see how this can be related with the
> >>> commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in
> >>> raid5d"")
> >>
> >> There is no question that the above mentioned commit makes this
> >> problem appear. While it may be that ultimately the root cause lies
> >> outside the md/raid5 code (I'm not able to make such an assessment), I
> >> can tell you that change is what turned it into a runtime
> >> regression. Prior to that change, I cannot reproduce the problem. One
> >> of my RAID-5 arrays has been running on every kernel version since
> >> 4.8, without issue. Then kernel 6.7.1 the problem appeared within
> >> hours of running the new code and affected not just one but two
> >> different machines with RAID-5 arrays. With that change reverted, the
> >> problem is not reproducible. Then when I recently upgraded to 6.8-rc5
> >> I immediately hit the problem again (because it hadn't been reverted
> >> in the mainline yet). I'm now running 6.8.0-rc5 on one of my affected
> >> machines without issue after reverting that commit on top of it.
> > [...]
> > I also tried again to reproduce the issue, but haven't got luck. While
> > I will continue try to repro the issue, I will also send the revert to 6.8
> > kernel.
>
> Is that revert on the way meanwhile? I'm asking because Linus might
> release 6.8 on Sunday.

The patch is on its way to 6.9 kernel via a PR yesterday [1]. It will land in
stable 6.8 kernel via stable backports.

Since this is not a new regression in 6.8 kernel and Dan is the only one
experiencing this, we would rather not rush last minute change to the 6.8
release.

Thanks,
Song

[1] https://lore.kernel.org/linux-raid/1C22EE73-62D9-43B0-B1A2-2D3B95F774AC@xxxxxx/





[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux