Re: [PATCH] Don't jump to compute_result state from check_result state

Song Liu <liu.song.a23@xxxxxxxxx> · Mon, 15 Apr 2019 12:24:23 -0700

On Mon, Apr 15, 2019 at 10:54 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:
>
>
>
> On 04/15/2019 01:44 PM, Song Liu wrote:
> > Hi Nigel,
> >
> > On Thu, Apr 11, 2019 at 10:21 AM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
> >> On Tue, Apr 9, 2019 at 12:26 PM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:
> >>>
> >>> On 4/9/19 2:09 PM, John Stoffel wrote:
> >>>>>>>>> "Dan" == Dan Williams <dan.j.williams@xxxxxxxxx> writes:
> >>>> Dan> On Mon, Apr 8, 2019 at 4:18 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
> >>>> Dan> [..]
> >>>>>>>> To trigger this issue, you not only need a failed disk but to also
> >>>>>>>> perform a scrubbing operation.  The customer's systems both crashed
> >>>>>>>> early Sunday morning when the raid-check script is run by default from cron.
> >>>>>>> Ok, I follow this, but I come to a different answer on the required
> >>>>>>> fix. I think it is simply the following to preserve the writeback
> >>>>>>> action after the parity check, because we need the failed Q slot to be
> >>>>>>> written back if we're recovering. P will be not up-to-date because it
> >>>>>>> was checked with the good disks, but sh->ops.zero_sum_result will be
> >>>>>>> 0, so that will skip the writeback of a !uptodate P value.
> >>>>>>>
> >>>>>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> >>>>>>> index c033bfcb209e..e2eb59289346 100644
> >>>>>>> --- a/drivers/md/raid5.c
> >>>>>>> +++ b/drivers/md/raid5.c
> >>>>>>> @@ -4187,7 +4187,6 @@ static void handle_parity_checks6(struct r5conf
> >>>>>>> *conf, struct stripe_head *sh,
> >>>>>>>                   /* now write out any block on a failed drive,
> >>>>>>>                    * or P or Q if they were recomputed
> >>>>>>>                    */
> >>>>>>> -               BUG_ON(s->uptodate < disks - 1); /* We don't need Q to
> >>>>>>> recover */
> >>>>>> Thanks Dan!
> >>>>>>
> >>>>>> Would it make sense to rework the check as
> >>>>>>
> >>>>>> BUG_ON(s->uptodate < disks - 2);
> >>>> Dan> I think the problem is that any 'uptodate' vs 'disks' check is
> >>>> Dan> not precise enough in this path. What might be better is to put
> >>>> Dan> "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags)" on the devices that
> >>>> Dan> might try to kick off writes and then skip the action. Better to
> >>>> Dan> prevent the raid driver from taking unexpected action *and* keep
> >>>> Dan> the system alive vs killing the machine with BUG_ON.
> >>>>
> >>>> Dan> BUG_ON has fallen out of favor for exception reporting since
> >>>> Dan> those assertions were introduced.
> >>>>
> >>>> And since it' causes the system to crash... it's super annoying when
> >>>> the rest of the system is working fine.  Please only use a WARN_ON,
> >>>> and maybe even set the RAID volume readonly, etc.  But don't bring
> >>>> down the rest of the system if possible.
> >>>>
> >>>> John
> >>> I reverted the first patch as it made its way upstream.
> >>>
> >>> Testing this change now.
> >>>
> >>> ---
> >>>
> >>>    drivers/md/raid5.c | 2 +-
> >>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> >>> index c033bfcb209e..660ca3af2914 100644
> >>> --- a/drivers/md/raid5.c
> >>> +++ b/drivers/md/raid5.c
> >>> @@ -4187,7 +4187,7 @@ static void handle_parity_checks6(struct r5conf
> >>> *conf, struct stripe_head *sh,
> >>>            /* now write out any block on a failed drive,
> >>>             * or P or Q if they were recomputed
> >>>             */
> >>> -        BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */
> >>> +        WARN_ON(s->uptodate < disks - 2); /* We don't need Q to recover */
> >> I think this WARN_ON() is the best way to go, though the comment "don't need Q"
> >> needs some revise.
> >>
> >> Nigel, how does this work in your tests?
> >>
> >> Thanks,
> >> Song
> > Could you please share updates about the tests?
> >
> > Thanks,
> > Song
>
> Hello Song,
> Our customer has confirmed, the issue is fixed. He had run this on two
> systems with defective Disks and run raid-check. There were some errors
> corrected, the device was degraded, but the kernel never crashed.
>
> -Nigel

Thanks Nigel!

Could you please submit official patch?

Song

>
> >>>            if (s->failed == 2) {
> >>>                dev = &sh->dev[s->failed_num[1]];
> >>>                s->locked++;
> >>> --
> >>> 2.20.1
> >>>
>