>> >> > Hi Mike, >> > thanks for the updates. >> > >> > I'm not entirely clear what is happening (in fact, due to a cold that I am >> > still fighting off, nothing is entirely clear at the moment), but it looks >> > very likely that the problem is due to an interplay between barrier handling, >> > and the multi-level structure of your array (a raid0 being a member of a >> > raid5). >> > >> > When a barrier request is processed, both arrays will schedule 'work' to be >> > done by the 'event' thread and I'm guess that you can get into a situation >> > where one work time is wait for the other, but the other is behind the one on >> > the single queue (I wonder if that make sense...) >> > >> > Anyway, this patch might make a difference, It reduced the number of work >> > items schedule in a way that could conceivably fix the problem. >> > >> > If you can test this, please report the results. I cannot easily reproduce >> > the problem so there is limited testing that I can do. >> > >> > Thanks, >> > NeilBrown >> > >> > >> > diff --git a/drivers/md/md.c b/drivers/md/md.c >> > index f20d13e..7f2785c 100644 >> > --- a/drivers/md/md.c >> > +++ b/drivers/md/md.c >> > @@ -294,6 +294,23 @@ EXPORT_SYMBOL(mddev_congested); >> > >> > #define POST_REQUEST_BARRIER ((void*)1) >> > >> > +static void md_barrier_done(mddev_t *mddev) >> > +{ >> > + struct bio *bio = mddev->barrier; >> > + >> > + if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) >> > + bio_endio(bio, -EOPNOTSUPP); >> > + else if (bio->bi_size == 0) >> > + bio_endio(bio, 0); >> > + else { >> > + /* other options need to be handled from process context */ >> > + schedule_work(&mddev->barrier_work); >> > + return; >> > + } >> > + mddev->barrier = NULL; >> > + wake_up(&mddev->sb_wait); >> > +} >> > + >> > static void md_end_barrier(struct bio *bio, int err) >> > { >> > mdk_rdev_t *rdev = bio->bi_private; >> > @@ -310,7 +327,7 @@ static void md_end_barrier(struct bio *bio, int err) >> > wake_up(&mddev->sb_wait); >> > } else >> > /* The pre-request barrier has finished */ >> > - schedule_work(&mddev->barrier_work); >> > + md_barrier_done(mddev); >> > } >> > bio_put(bio); >> > } >> > @@ -350,18 +367,12 @@ static void md_submit_barrier(struct work_struct *ws) >> > >> > atomic_set(&mddev->flush_pending, 1); >> > >> > - if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) >> > - bio_endio(bio, -EOPNOTSUPP); >> > - else if (bio->bi_size == 0) >> > - /* an empty barrier - all done */ >> > - bio_endio(bio, 0); >> > - else { >> > - bio->bi_rw &= ~REQ_HARDBARRIER; >> > - if (mddev->pers->make_request(mddev, bio)) >> > - generic_make_request(bio); >> > - mddev->barrier = POST_REQUEST_BARRIER; >> > - submit_barriers(mddev); >> > - } >> > + bio->bi_rw &= ~REQ_HARDBARRIER; >> > + if (mddev->pers->make_request(mddev, bio)) >> > + generic_make_request(bio); >> > + mddev->barrier = POST_REQUEST_BARRIER; >> > + submit_barriers(mddev); >> > + >> > if (atomic_dec_and_test(&mddev->flush_pending)) { >> > mddev->barrier = NULL; >> > wake_up(&mddev->sb_wait); >> > @@ -383,7 +394,7 @@ void md_barrier_request(mddev_t *mddev, struct bio *bio) >> > submit_barriers(mddev); >> > >> > if (atomic_dec_and_test(&mddev->flush_pending)) >> > - schedule_work(&mddev->barrier_work); >> > + md_barrier_done(mddev); >> > } >> > EXPORT_SYMBOL(md_barrier_request); >> > >> > >> > >> >> Neil, thanks for the patch. I experienced the lockup for the 5th time >> an hour ago (about 3 hours after the last hard reboot) so I thought it >> would be a good time to try your patch. Unfortunately I'm getting an >> error: >> >> patching file drivers/md/md.c >> Hunk #1 succeeded at 291 with fuzz 1 (offset -3 lines). >> Hunk #2 FAILED at 324. >> Hunk #3 FAILED at 364. >> Hunk #4 FAILED at 391. >> 3 out of 4 hunks FAILED -- saving rejects to file drivers/md/md.c.rej > > That is odd. > I took the md.c that you posted on the web site, use "patch" to apply my > patch to it, and only Hunk #3 failed. > > I used 'wiggle' to apply the patch and it applied perfectly, properly > replacing (1<<BIO_RW_BARRIER) with REQ_HARDBARRIER (or the other way around). > > Try this version. You will need to be in drivers/md/, or use > > patch drivers/md/md.c < this-patch > > > NeilBrown > > --- md.c.orig 2010-09-14 11:29:15.000000000 +1000 > +++ md.c 2010-09-14 11:29:50.000000000 +1000 > @@ -291,6 +291,23 @@ > > #define POST_REQUEST_BARRIER ((void*)1) > > +static void md_barrier_done(mddev_t *mddev) > +{ > + struct bio *bio = mddev->barrier; > + > + if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) > + bio_endio(bio, -EOPNOTSUPP); > + else if (bio->bi_size == 0) > + bio_endio(bio, 0); > + else { > + /* other options need to be handled from process context */ > + schedule_work(&mddev->barrier_work); > + return; > + } > + mddev->barrier = NULL; > + wake_up(&mddev->sb_wait); > +} > + > static void md_end_barrier(struct bio *bio, int err) > { > mdk_rdev_t *rdev = bio->bi_private; > @@ -307,7 +324,7 @@ > wake_up(&mddev->sb_wait); > } else > /* The pre-request barrier has finished */ > - schedule_work(&mddev->barrier_work); > + md_barrier_done(mddev); > } > bio_put(bio); > } > @@ -347,18 +364,12 @@ > > atomic_set(&mddev->flush_pending, 1); > > - if (test_bit(BIO_EOPNOTSUPP, &bio->bi_flags)) > - bio_endio(bio, -EOPNOTSUPP); > - else if (bio->bi_size == 0) > - /* an empty barrier - all done */ > - bio_endio(bio, 0); > - else { > - bio->bi_rw &= ~(1<<BIO_RW_BARRIER); > - if (mddev->pers->make_request(mddev, bio)) > - generic_make_request(bio); > - mddev->barrier = POST_REQUEST_BARRIER; > - submit_barriers(mddev); > - } > + bio->bi_rw &= ~(1<<BIO_RW_BARRIER); > + if (mddev->pers->make_request(mddev, bio)) > + generic_make_request(bio); > + mddev->barrier = POST_REQUEST_BARRIER; > + submit_barriers(mddev); > + > if (atomic_dec_and_test(&mddev->flush_pending)) { > mddev->barrier = NULL; > wake_up(&mddev->sb_wait); > @@ -380,7 +391,7 @@ > submit_barriers(mddev); > > if (atomic_dec_and_test(&mddev->flush_pending)) > - schedule_work(&mddev->barrier_work); > + md_barrier_done(mddev); > } > EXPORT_SYMBOL(md_barrier_request); > > Sorry about that Neil, it was my fault. I copied your patch out of the email and I think it picked up some unintended characters. I tried copying it from the mailing list archive website instead and it patched in fine. The kernel compiled with no trouble and I'm booted into it now. No unexpected side-effects yet. I'll continue with my copying and we'll see if it locks up again. Thanks for all your help! Mike -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html