RE: New RAID causing system lockups

Forgot to include the mailing list on this.

> Hi Mike,
>  thanks for the updates.
>
> I'm not entirely clear what is happening (in fact, due to a cold that I am
> still fighting off, nothing is entirely clear at the moment), but it looks
> very likely that the problem is due to an interplay between barrier handling
> and the multi-level structure of your array (a raid0 being a member of a
> raid5).
>
> When a barrier request is processed, both arrays will schedule 'work' to be
> done by the 'event' thread, and I'm guessing that you can get into a situation
> where one work item is waiting for the other, but the other is queued behind it
> on the single queue (I wonder if that makes sense...)
>
> Anyway, this patch might make a difference.  It reduces the number of work
> items scheduled in a way that could conceivably fix the problem.
>
> If you can test this, please report the results.  I cannot easily reproduce
> the problem so there is limited testing that I can do.
>
> Thanks,
> NeilBrown
>
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index f20d13e..7f2785c 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -294,6 +294,23 @@ EXPORT_SYMBOL(mddev_congested);
>
>  #define POST_REQUEST_BARRIER ((void*)1)
>
> +static void md_barrier_done(mddev_t *mddev)
> +{
> +       struct bio *bio = mddev->barrier;
> +
> +       if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
> +               bio_endio(bio, -EOPNOTSUPP);
> +       else if (bio->bi_size == 0)
> +               bio_endio(bio, 0);
> +       else {
> +               /* other options need to be handled from process context */
> +               schedule_work(&mddev->barrier_work);
> +               return;
> +       }
> +       mddev->barrier = NULL;
> +       wake_up(&mddev->sb_wait);
> +}
> +
>  static void md_end_barrier(struct bio *bio, int err)
>  {
>        mdk_rdev_t *rdev = bio->bi_private;
> @@ -310,7 +327,7 @@ static void md_end_barrier(struct bio *bio, int err)
>                        wake_up(&mddev->sb_wait);
>                } else
>                        /* The pre-request barrier has finished */
> -                       schedule_work(&mddev->barrier_work);
> +                       md_barrier_done(mddev);
>        }
>        bio_put(bio);
>  }
> @@ -350,18 +367,12 @@ static void md_submit_barrier(struct work_struct *ws)
>
>        atomic_set(&mddev->flush_pending, 1);
>
> -       if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
> -               bio_endio(bio, -EOPNOTSUPP);
> -       else if (bio->bi_size == 0)
> -               /* an empty barrier - all done */
> -               bio_endio(bio, 0);
> -       else {
> -               bio->bi_rw &= ~REQ_HARDBARRIER;
> -               if (mddev->pers->make_request(mddev, bio))
> -                       generic_make_request(bio);
> -               mddev->barrier = POST_REQUEST_BARRIER;
> -               submit_barriers(mddev);
> -       }
> +       bio->bi_rw &= ~REQ_HARDBARRIER;
> +       if (mddev->pers->make_request(mddev, bio))
> +               generic_make_request(bio);
> +       mddev->barrier = POST_REQUEST_BARRIER;
> +       submit_barriers(mddev);
> +
>        if (atomic_dec_and_test(&mddev->flush_pending)) {
>                mddev->barrier = NULL;
>                wake_up(&mddev->sb_wait);
> @@ -383,7 +394,7 @@ void md_barrier_request(mddev_t *mddev, struct bio *bio)
>        submit_barriers(mddev);
>
>        if (atomic_dec_and_test(&mddev->flush_pending))
> -               schedule_work(&mddev->barrier_work);
> +               md_barrier_done(mddev);
>  }
>  EXPORT_SYMBOL(md_barrier_request);
>
>
>
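To make sure I follow the interplay described above, here is a minimal
userspace sketch of that single-queue deadlock as I understand it. It is a
pthread toy, not the md code paths: the function names and the two-entry
queue are purely illustrative stand-ins for the upper-level (raid5) and
lower-level (raid0) barrier work items sharing one 'events' thread.

/* deadlock-sketch.c -- illustration only, NOT the drivers/md code.
 * A single worker thread drains a two-entry FIFO, mimicking the one
 * 'events' kthread.  Item A (the upper-level barrier work) waits for a
 * flag that only item B (the lower-level barrier work) sets, but B sits
 * behind A in the same queue, so the worker never reaches it.
 * Build with "cc -pthread deadlock-sketch.c"; running it hangs by design. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int lower_level_done;                    /* set only by item B */

static void upper_barrier_work(void)            /* item A: queued first */
{
        pthread_mutex_lock(&lock);
        while (!lower_level_done)               /* blocks the only worker... */
                pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
        printf("upper-level barrier done\n");
}

static void lower_barrier_work(void)            /* item B: queued second */
{
        pthread_mutex_lock(&lock);
        lower_level_done = 1;                   /* ...on a flag only B would set */
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        printf("lower-level barrier done\n");
}

static void *worker(void *arg)                  /* the single work-queue thread */
{
        void (*queue[2])(void) = { upper_barrier_work, lower_barrier_work };
        int i;

        (void)arg;
        for (i = 0; i < 2; i++)
                queue[i]();                     /* strictly in order: never reaches B */
        return NULL;
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, worker, NULL);
        pthread_join(&t, NULL);                 /* never returns */
        return 0;
}

If I'm reading the patch right, completing the unsupported and empty-barrier
cases directly in md_barrier_done() means fewer items ever reach that single
queue, which would break this kind of cycle.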

Neil, thanks for the patch. I experienced the lockup for the 5th time
an hour ago (about 3 hours after the last hard reboot), so I thought it
would be a good time to try your patch. Unfortunately, I'm getting an
error:

patching file drivers/md/md.c
Hunk #1 succeeded at 291 with fuzz 1 (offset -3 lines).
Hunk #2 FAILED at 324.
Hunk #3 FAILED at 364.
Hunk #4 FAILED at 391.
3 out of 4 hunks FAILED -- saving rejects to file drivers/md/md.c.rej

"uname -r" gives "2.6.35-gentoo-r4", so I suspect that's why. I guess
the standard gentoo patchset does something with that file. I'm
skimming through md.c to see if I can understand it well enough to
apply the patch functionality manually. I've also uploaded my
2.6.35-gentoo-r4 md.c to www.hartmanipulation.com/raid/ with the other
files in case you or someone else wants to take a look at it.
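Before hand-merging, one thing I may try first (just a guess on my part; the
failures could be real conflicts with the Gentoo changes rather than plain
context drift) is re-running patch with a larger fuzz factor and then reading
whatever still ends up in the rejects file next to your diff, along the lines
of:

# your mail's diff saved locally as md-barrier.patch (name is arbitrary)
patch -p1 --dry-run --fuzz=3 < md-barrier.patch
# anything that still fails stays here for manual merging:
less drivers/md/md.c.rej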

Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

