Re: raid5 Journal Recovery Bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 19, 2022 at 3:52 PM Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>
> Hi Song,
>
> I'm wondering if you can help shed some light on a bug I'm trying to
> track down.
>
> We're hitting the BUG_ON in handle_parity_checks5() that tests to ensure
> R5_UPTODATE is set for a failed disk in a stripe[1].
>
> We hit this in our test suite somewhat rarely when the journal is
> enabled doing device removal and recovery tests. We've concocted a test
> that can hit it in under ten minutes.
>
> After some debugging I've found that the stripe that hits the BUG_ON is
> hitting a conditional in handle_stripe_fill() for stripes that are in
> the journal with a failed disk[2]. This check was added in 2017 by your
> patch:
>
>    07e83364845e ("md/r5cache: shift complex rmw from read path to write
> path")
>
> A stripe that hits the bug has one injournal dev, and one failed dev and
> does not have STRIPE_R5C_CACHING set and therefore hits the conditional
> and returns from handle_stripe_fill() without calling fetch_block() or
> doing anything else to change the flow of execution. Normally,
> fetch_block() would set STRIPE_COMPUTE_RUN to recompute the missing
> disk, however that gets skipped for this case. After returning from
> handle_stripe_fill(), handle_stripe() will then call
> handle_parity_checks5() because STRIPE_COMPUTE_RUN was not set and this
> will immediately hit the BUG_ON, because nothing has computed the disk
> and set it UPTODATE yet.
>
> I can't say I fully understand the patch that added this, so I don't
> really understand why that conditional is there or what it's trying to
> accomplish and thus I don't know what the correct solution might be.
>
> Any thoughts?

Could you please add some printk so that we know which condition triggered
handle_stripe_fill() here:

        if (s.to_read || s.non_overwrite
            || (s.to_write && s.failed)
            || (s.syncing && (s.uptodate + s.compute < disks))
            || s.replacing
            || s.expanding)
                handle_stripe_fill(sh, &s, disks);

This would help us narrow down to the exact condition. I guess it is
"(s.to_write && s.failed)", but I am not quite sure.

Thanks,
Song



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux