Re: [PATCH] Don't jump to compute_result state from check_result state

On 04/15/2019 01:44 PM, Song Liu wrote:
Hi Nigel,

On Thu, Apr 11, 2019 at 10:21 AM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
On Tue, Apr 9, 2019 at 12:26 PM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote:

On 4/9/19 2:09 PM, John Stoffel wrote:
"Dan" == Dan Williams <dan.j.williams@xxxxxxxxx> writes:
Dan> On Mon, Apr 8, 2019 at 4:18 PM Song Liu <liu.song.a23@xxxxxxxxx> wrote:
Dan> [..]
To trigger this issue, you need not only a failed disk but also a
scrubbing operation.  The customer's systems both crashed early Sunday
morning, which is when the raid-check script runs by default from cron.
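
(For reproduction context: the raid-check script effectively asks the md
layer for a scrub by writing "check" to the array's sync_action sysfs
attribute.  A minimal sketch of that trigger is below; the array name
md0 is an assumption, not taken from the report.)

        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
                /* assumed array name; adjust to the md device under test */
                const char *path = "/sys/block/md0/md/sync_action";
                int fd = open(path, O_WRONLY);

                if (fd < 0) {
                        perror("open");
                        return 1;
                }
                /* start a scrub, as the cron-driven raid-check does */
                if (write(fd, "check", 5) != 5)
                        perror("write");
                close(fd);
                return 0;
        }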
Ok, I follow this, but I come to a different answer on the required
fix. I think the fix is simply the following, which preserves the
write-back action after the parity check, because we need the failed Q
slot to be written back if we're recovering. P will not be up-to-date
because it was checked with the good disks, but sh->ops.zero_sum_result
will be 0, so the write-back of a !uptodate P value will be skipped.

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c033bfcb209e..e2eb59289346 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4187,7 +4187,6 @@ static void handle_parity_checks6(struct r5conf
*conf, struct stripe_head *sh,
                  /* now write out any block on a failed drive,
                   * or P or Q if they were recomputed
                   */
-               BUG_ON(s->uptodate < disks - 1); /* We don't need Q to
recover */
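
(For context: the write-back gating that follows this BUG_ON in
handle_parity_checks6() looks roughly like the condensed sketch below.
It is a paraphrase, not a verbatim quote of raid5.c, but it shows why a
zero zero_sum_result skips writing out a !uptodate P.)

        /* condensed sketch, approximating mainline raid5.c */
        if (s->failed == 2) {                   /* second failed slot */
                dev = &sh->dev[s->failed_num[1]];
                s->locked++;
                set_bit(R5_LOCKED, &dev->flags);
                set_bit(R5_Wantwrite, &dev->flags);
        }
        if (s->failed >= 1) {                   /* first failed slot */
                dev = &sh->dev[s->failed_num[0]];
                s->locked++;
                set_bit(R5_LOCKED, &dev->flags);
                set_bit(R5_Wantwrite, &dev->flags);
        }
        if (sh->ops.zero_sum_result & SUM_CHECK_P_RESULT) {
                /* P is only rewritten when the check flagged it, so a
                 * zero result never writes back a !uptodate P */
                dev = &sh->dev[sh->pd_idx];
                s->locked++;
                set_bit(R5_LOCKED, &dev->flags);
                set_bit(R5_Wantwrite, &dev->flags);
        }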
Thanks Dan!

Would it make sense to rework the check as

BUG_ON(s->uptodate < disks - 2);
Dan> I think the problem is that any 'uptodate' vs 'disks' check is
Dan> not precise enough in this path. What might be better is to put
Dan> "WARN_ON(!test_bit(R5_UPTODATE, &dev->flags)" on the devices that
Dan> might try to kick off writes and then skip the action. Better to
Dan> prevent the raid driver from taking unexpected action *and* keep
Dan> the system alive vs killing the machine with BUG_ON.

Dan> BUG_ON has fallen out of favor for exception reporting since
Dan> those assertions were introduced.
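
(A minimal sketch of the per-device check Dan describes, applied to one
of the failed-slot write-backs in handle_parity_checks6(); this
illustrates the idea and is not a patch posted in this thread.)

        if (s->failed == 2) {
                dev = &sh->dev[s->failed_num[1]];
                /* warn and skip the write instead of BUG_ON */
                if (!WARN_ON(!test_bit(R5_UPTODATE, &dev->flags))) {
                        s->locked++;
                        set_bit(R5_LOCKED, &dev->flags);
                        set_bit(R5_Wantwrite, &dev->flags);
                }
        }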

And since it causes the system to crash... it's super annoying when
the rest of the system is working fine.  Please only use a WARN_ON,
and maybe even set the RAID volume readonly, etc.  But don't bring
down the rest of the system if possible.

John
I reverted the first patch as it made its way upstream.

Testing this change now.

---

   drivers/md/raid5.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c033bfcb209e..660ca3af2914 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4187,7 +4187,7 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh,
                /* now write out any block on a failed drive,
                 * or P or Q if they were recomputed
                 */
-               BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */
+               WARN_ON(s->uptodate < disks - 2); /* We don't need Q to recover */
I think this WARN_ON() is the best way to go, though the comment "don't need Q"
needs some revision.

Nigel, how does this work in your tests?

Thanks,
Song
Could you please share an update on the tests?

Thanks,
Song

Hello Song,
Our customer has confirmed that the issue is fixed. He ran this on two systems with defective disks and ran raid-check. Some errors were corrected and the device was degraded, but the kernel never crashed.

-Nigel

                if (s->failed == 2) {
                        dev = &sh->dev[s->failed_num[1]];
                        s->locked++;
--
2.20.1




