On Mon, Apr 15, 2019 at 10:54 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote: > > > > On 04/15/2019 01:44 PM, Song Liu wrote: > > Hi Nigel, > > > > On Thu, Apr 11, 2019 at 10:21 AM Song Liu <liu.song.a23@xxxxxxxxx> wrote: > >> On Tue, Apr 9, 2019 at 12:26 PM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote: > >>> > >>> On 4/9/19 2:09 PM, John Stoffel wrote: > >>>>>>>>> "Dan" == Dan Williams <dan.j.williams@xxxxxxxxx> writes: > >>>> Dan> On Mon, Apr 8, 2019 at 4:18 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote: > >>>> Dan> [..] > >>>>>>>> To trigger this issue, you not only need a failed disk but to also > >>>>>>>> perform a scrubbing operation. The customer's systems both crashed > >>>>>>>> early Sunday morning when the raid-check script is run by default from cron. > >>>>>>> Ok, I follow this, but I come to a different answer on the required > >>>>>>> fix. I think it is simply the following to preserve the writeback > >>>>>>> action after the parity check, because we need the failed Q slot to be > >>>>>>> written back if we're recovering. P will be not up-to-date because it > >>>>>>> was checked with the good disks, but sh->ops.zero_sum_result will be > >>>>>>> 0, so that will skip the writeback of a !uptodate P value. > >>>>>>> > >>>>>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >>>>>>> index c033bfcb209e..e2eb59289346 100644 > >>>>>>> --- a/drivers/md/raid5.c > >>>>>>> +++ b/drivers/md/raid5.c > >>>>>>> @@ -4187,7 +4187,6 @@ static void handle_parity_checks6(struct r5conf > >>>>>>> *conf, struct stripe_head *sh, > >>>>>>> /* now write out any block on a failed drive, > >>>>>>> * or P or Q if they were recomputed > >>>>>>> */ > >>>>>>> - BUG_ON(s->uptodate < disks - 1); /* We don't need Q to > >>>>>>> recover */ > >>>>>> Thanks Dan! > >>>>>> > >>>>>> Would it make sense to rework the check as > >>>>>> > >>>>>> BUG_ON(s->uptodate < disks - 2); > >>>> Dan> I think the problem is that any 'uptodate' vs 'disks' check is > >>>> Dan> not precise enough in this path. What might be better is to put > >>>> Dan> "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags)" on the devices that > >>>> Dan> might try to kick off writes and then skip the action. Better to > >>>> Dan> prevent the raid driver from taking unexpected action *and* keep > >>>> Dan> the system alive vs killing the machine with BUG_ON. > >>>> > >>>> Dan> BUG_ON has fallen out of favor for exception reporting since > >>>> Dan> those assertions were introduced. > >>>> > >>>> And since it' causes the system to crash... it's super annoying when > >>>> the rest of the system is working fine. Please only use a WARN_ON, > >>>> and maybe even set the RAID volume readonly, etc. But don't bring > >>>> down the rest of the system if possible. > >>>> > >>>> John > >>> I reverted the first patch as it made its way upstream. > >>> > >>> Testing this change now. > >>> > >>> --- > >>> > >>> drivers/md/raid5.c | 2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >>> index c033bfcb209e..660ca3af2914 100644 > >>> --- a/drivers/md/raid5.c > >>> +++ b/drivers/md/raid5.c > >>> @@ -4187,7 +4187,7 @@ static void handle_parity_checks6(struct r5conf > >>> *conf, struct stripe_head *sh, > >>> /* now write out any block on a failed drive, > >>> * or P or Q if they were recomputed > >>> */ > >>> - BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */ > >>> + WARN_ON(s->uptodate < disks - 2); /* We don't need Q to recover */ > >> I think this WARN_ON() is the best way to go, though the comment "don't need Q" > >> needs some revise. > >> > >> Nigel, how does this work in your tests? > >> > >> Thanks, > >> Song > > Could you please share updates about the tests? > > > > Thanks, > > Song > > Hello Song, > Our customer has confirmed, the issue is fixed. He had run this on two > systems with defective Disks and run raid-check. There were some errors > corrected, the device was degraded, but the kernel never crashed. > > -Nigel Thanks Nigel! Could you please submit official patch? Song > > >>> if (s->failed == 2) { > >>> dev = &sh->dev[s->failed_num[1]]; > >>> s->locked++; > >>> -- > >>> 2.20.1 > >>> >