On Monday May 19, snitzer@xxxxxxxxx wrote:
>
> Hi Neil,
>
> Sorry about not getting back with you sooner. Thanks for putting
> significant time to chasing this problem.
>
> I tested your most recent patch and unfortunately still hit the case
> where the nbd member becomes degraded yet the array continues to clear
> bits (events_cleared of the non-degraded member is higher than the
> degraded member). Is this behavior somehow expected/correct?

It shouldn't be..... ahhh.
There is a delay between noting that the bit can be cleared, and
actually writing the zero to disk. This is obviously intentional
in case the bit gets set again quickly.
I'm sampling the event count at the latter point instead of the
former, and there is time for it to change.

Maybe this patch on top of what I recently sent out?

Thanks,
NeilBrown


Signed-off-by: Neil Brown <neilb@xxxxxxx>

### Diffstat output
 ./drivers/md/bitmap.c         |   10 ++++++++--
 ./include/linux/raid/bitmap.h |    1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff .prev/drivers/md/bitmap.c ./drivers/md/bitmap.c
--- .prev/drivers/md/bitmap.c	2008-05-19 15:23:42.000000000 +1000
+++ ./drivers/md/bitmap.c	2008-05-19 15:24:56.000000000 +1000
@@ -1092,9 +1092,9 @@ void bitmap_daemon_work(struct bitmap *b
 			/* We are possibly going to clear some bits, so make
 			 * sure that events_cleared is up-to-date.
 			 */
-			if (bitmap->events_cleared < bitmap->mddev->events) {
+			if (bitmap->need_sync) {
 				bitmap_super_t *sb;
-				bitmap->events_cleared = bitmap->mddev->events;
+				bitmap->need_sync = 0;
 				wait_event(bitmap->mddev->sb_wait,
 					   !test_bit(MD_CHANGE_CLEAN,
 						     &bitmap->mddev->flags));
@@ -1273,6 +1273,12 @@ void bitmap_endwrite(struct bitmap *bitm
 			return;
 		}
 
+		if (success &&
+		    bitmap->events_cleared < bitmap->mddev->events) {
+			bitmap->events_cleared = bitmap->mddev->events;
+			bitmap->need_sync = 1;
+		}
+
 		if (!success && ! (*bmc & NEEDED_MASK))
 			*bmc |= NEEDED_MASK;
 
diff .prev/include/linux/raid/bitmap.h ./include/linux/raid/bitmap.h
--- .prev/include/linux/raid/bitmap.h	2008-05-19 15:23:50.000000000 +1000
+++ ./include/linux/raid/bitmap.h	2008-05-19 15:24:56.000000000 +1000
@@ -221,6 +221,7 @@ struct bitmap {
 	unsigned long syncchunk;
 
 	__u64 events_cleared;
+	int need_sync;
 
 	/* bitmap spinlock */
 	spinlock_t lock;
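
The timing problem described in the message can be illustrated with a
small user-space model. This is only a sketch of one reading of the
patch, not kernel code: the model_* names, the on_disk_events_cleared
field, and the event numbers 10 and 12 are all made up for
illustration, and locking and the real superblock write are left out.
It compares sampling the event count in the daemon (after the delay)
with sampling it when the write completes and only flushing it later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the kernel's mddev or struct bitmap. */
struct model_array {
	uint64_t events;                 /* array event counter */
};

struct model_bitmap {
	struct model_array *array;
	uint64_t events_cleared;         /* event count tied to cleared bits */
	int need_sync;                   /* events_cleared still has to be written out */
	uint64_t on_disk_events_cleared; /* what the model "superblock" holds */
};

/* Patched scheme: sample the event count when a write completes successfully. */
static void model_endwrite(struct model_bitmap *bm, int success)
{
	if (success && bm->events_cleared < bm->array->events) {
		bm->events_cleared = bm->array->events;
		bm->need_sync = 1;
	}
}

/* Patched scheme: the daemon only flushes the value sampled earlier. */
static void model_daemon_patched(struct model_bitmap *bm)
{
	if (bm->need_sync) {
		bm->need_sync = 0;
		bm->on_disk_events_cleared = bm->events_cleared;
	}
}

/* Earlier scheme: the daemon samples the event count itself, after the delay. */
static void model_daemon_late_sample(struct model_bitmap *bm)
{
	if (bm->events_cleared < bm->array->events) {
		bm->events_cleared = bm->array->events;
		bm->on_disk_events_cleared = bm->events_cleared;
	}
}

/*
 * Replay one scenario: a write completes at event 10, a member fails
 * during the delay (events becomes 12), then the daemon runs.
 * Returns the events_cleared value the model would put on disk.
 */
static uint64_t model_replay(int patched)
{
	struct model_array array = { .events = 10 };
	struct model_bitmap bm = { .array = &array };

	if (patched)
		model_endwrite(&bm, 1);   /* sample at write completion */

	array.events = 12;                /* array degrades during the delay */

	if (patched)
		model_daemon_patched(&bm);
	else
		model_daemon_late_sample(&bm);

	assert(bm.on_disk_events_cleared <= array.events);
	return bm.on_disk_events_cleared;
}
```

In this model, model_replay(0) records 12, an event count reached only
after the member failed, while model_replay(1) records 10, the count
that was current when the write actually completed.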