On Mon, 13 Sep 2010 21:11:30 -0400
Mike Hartman <mike@xxxxxxxxxxxxxxxxxxxx> wrote:

> Forgot to include the mailing list on this.
>
> > Hi Mike,
> >  thanks for the updates.
> >
> > I'm not entirely clear what is happening (in fact, due to a cold that I am
> > still fighting off, nothing is entirely clear at the moment), but it looks
> > very likely that the problem is due to an interplay between barrier handling
> > and the multi-level structure of your array (a raid0 being a member of a
> > raid5).
> >
> > When a barrier request is processed, both arrays will schedule 'work' to be
> > done by the 'event' thread, and I'm guessing that you can get into a situation
> > where one work item is waiting for the other, but the other is behind it on
> > the single queue (I wonder if that makes sense...)
> >
> > Anyway, this patch might make a difference. It reduces the number of work
> > items scheduled in a way that could conceivably fix the problem.
> >
> > If you can test this, please report the results. I cannot easily reproduce
> > the problem so there is limited testing that I can do.
> >
> > Thanks,
> > NeilBrown
> >
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index f20d13e..7f2785c 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -294,6 +294,23 @@ EXPORT_SYMBOL(mddev_congested);
> >  
> >  #define POST_REQUEST_BARRIER ((void*)1)
> >  
> > +static void md_barrier_done(mddev_t *mddev)
> > +{
> > +	struct bio *bio = mddev->barrier;
> > +
> > +	if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
> > +		bio_endio(bio, -EOPNOTSUPP);
> > +	else if (bio->bi_size == 0)
> > +		bio_endio(bio, 0);
> > +	else {
> > +		/* other options need to be handled from process context */
> > +		schedule_work(&mddev->barrier_work);
> > +		return;
> > +	}
> > +	mddev->barrier = NULL;
> > +	wake_up(&mddev->sb_wait);
> > +}
> > +
> >  static void md_end_barrier(struct bio *bio, int err)
> >  {
> >  	mdk_rdev_t *rdev = bio->bi_private;
> > @@ -310,7 +327,7 @@ static void md_end_barrier(struct bio *bio, int err)
> >  			wake_up(&mddev->sb_wait);
> >  		} else
> >  			/* The pre-request barrier has finished */
> > -			schedule_work(&mddev->barrier_work);
> > +			md_barrier_done(mddev);
> >  	}
> >  	bio_put(bio);
> >  }
> > @@ -350,18 +367,12 @@ static void md_submit_barrier(struct work_struct *ws)
> >  
> >  	atomic_set(&mddev->flush_pending, 1);
> >  
> > -	if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
> > -		bio_endio(bio, -EOPNOTSUPP);
> > -	else if (bio->bi_size == 0)
> > -		/* an empty barrier - all done */
> > -		bio_endio(bio, 0);
> > -	else {
> > -		bio->bi_rw &= ~REQ_HARDBARRIER;
> > -		if (mddev->pers->make_request(mddev, bio))
> > -			generic_make_request(bio);
> > -		mddev->barrier = POST_REQUEST_BARRIER;
> > -		submit_barriers(mddev);
> > -	}
> > +	bio->bi_rw &= ~REQ_HARDBARRIER;
> > +	if (mddev->pers->make_request(mddev, bio))
> > +		generic_make_request(bio);
> > +	mddev->barrier = POST_REQUEST_BARRIER;
> > +	submit_barriers(mddev);
> > +
> >  	if (atomic_dec_and_test(&mddev->flush_pending)) {
> >  		mddev->barrier = NULL;
> >  		wake_up(&mddev->sb_wait);
> > @@ -383,7 +394,7 @@ void md_barrier_request(mddev_t *mddev, struct bio *bio)
> >  	submit_barriers(mddev);
> >  
> >  	if (atomic_dec_and_test(&mddev->flush_pending))
> > -		schedule_work(&mddev->barrier_work);
> > +		md_barrier_done(mddev);
> >  }
> >  EXPORT_SYMBOL(md_barrier_request);
> >  
> >
> >
>
> Neil, thanks for the patch. I experienced the lockup for the 5th time
> an hour ago (about 3 hours after the last hard reboot) so I thought it
> would be a good time to try your patch. Unfortunately I'm getting an
> error:
>
> patching file drivers/md/md.c
> Hunk #1 succeeded at 291 with fuzz 1 (offset -3 lines).
> Hunk #2 FAILED at 324.
> Hunk #3 FAILED at 364.
> Hunk #4 FAILED at 391.
> 3 out of 4 hunks FAILED -- saving rejects to file drivers/md/md.c.rej

That is odd.
I took the md.c that you posted on the web site, used "patch" to apply my
patch to it, and only Hunk #3 failed.
I used 'wiggle' to apply the patch and it applied perfectly, properly
replacing (1<<BIO_RW_BARRIER) with REQ_HARDBARRIER (or the other way around).

Try this version.
You will need to be in drivers/md/, or use
  patch drivers/md/md.c < this-patch

NeilBrown

--- md.c.orig	2010-09-14 11:29:15.000000000 +1000
+++ md.c	2010-09-14 11:29:50.000000000 +1000
@@ -291,6 +291,23 @@
 
 #define POST_REQUEST_BARRIER ((void*)1)
 
+static void md_barrier_done(mddev_t *mddev)
+{
+	struct bio *bio = mddev->barrier;
+
+	if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
+		bio_endio(bio, -EOPNOTSUPP);
+	else if (bio->bi_size == 0)
+		bio_endio(bio, 0);
+	else {
+		/* other options need to be handled from process context */
+		schedule_work(&mddev->barrier_work);
+		return;
+	}
+	mddev->barrier = NULL;
+	wake_up(&mddev->sb_wait);
+}
+
 static void md_end_barrier(struct bio *bio, int err)
 {
 	mdk_rdev_t *rdev = bio->bi_private;
@@ -307,7 +324,7 @@
 			wake_up(&mddev->sb_wait);
 		} else
 			/* The pre-request barrier has finished */
-			schedule_work(&mddev->barrier_work);
+			md_barrier_done(mddev);
 	}
 	bio_put(bio);
 }
@@ -347,18 +364,12 @@
 
 	atomic_set(&mddev->flush_pending, 1);
 
-	if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags))
-		bio_endio(bio, -EOPNOTSUPP);
-	else if (bio->bi_size == 0)
-		/* an empty barrier - all done */
-		bio_endio(bio, 0);
-	else {
-		bio->bi_rw &= ~(1<<BIO_RW_BARRIER);
-		if (mddev->pers->make_request(mddev, bio))
-			generic_make_request(bio);
-		mddev->barrier = POST_REQUEST_BARRIER;
-		submit_barriers(mddev);
-	}
+	bio->bi_rw &= ~(1<<BIO_RW_BARRIER);
+	if (mddev->pers->make_request(mddev, bio))
+		generic_make_request(bio);
+	mddev->barrier = POST_REQUEST_BARRIER;
+	submit_barriers(mddev);
+
 	if (atomic_dec_and_test(&mddev->flush_pending)) {
 		mddev->barrier = NULL;
 		wake_up(&mddev->sb_wait);
@@ -380,7 +391,7 @@
 	submit_barriers(mddev);
 
 	if (atomic_dec_and_test(&mddev->flush_pending))
-		schedule_work(&mddev->barrier_work);
+		md_barrier_done(mddev);
 }
 EXPORT_SYMBOL(md_barrier_request);
 
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
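
Note on the suspected hang: the quoted explanation is that one work item may
end up waiting for another that sits behind it on the same single queue. The
user-space sketch below only models that guess; it is not kernel code, and
struct work_item, NO_DEP and run_single_queue() are hypothetical names made up
for the illustration. It assumes a single worker that runs items strictly in
FIFO order and can never skip ahead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_DEP    (-1)
#define MAX_ITEMS 64

/* Hypothetical model of a queued work item; not a kernel structure. */
struct work_item {
	int waits_for;	/* index of the item that must finish first, or NO_DEP */
};

/*
 * Run the items in FIFO order on one worker.  Returns the index of the
 * item on which the worker stalls, or NO_DEP if every item completes.
 */
static int run_single_queue(const struct work_item *q, size_t n)
{
	bool done[MAX_ITEMS] = { false };

	assert(n <= MAX_ITEMS);
	for (size_t i = 0; i < n; i++) {
		int dep = q[i].waits_for;

		assert(dep == NO_DEP || (dep >= 0 && (size_t)dep < n));
		/*
		 * The lone worker cannot skip ahead: if the dependency has
		 * not completed by now, it is queued behind this item (or is
		 * this item) and will never run.
		 */
		if (dep != NO_DEP && !done[dep])
			return (int)i;
		done[i] = true;
	}
	return NO_DEP;
}
```

With a queue of { waits_for = 1 }, { waits_for = NO_DEP } the function returns
0: the first item blocks the only worker while the item it needs is still
queued behind it. Reversing the dependency lets both items complete. Whether
this is what actually happens in the raid0-inside-raid5 case is exactly what
the patch above is meant to test.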