On Fri, May 01, 2009 at 12:11:43PM +1000, Neil Brown wrote:
> There are some other places where we are overflowing on a shift.
> One of those (in bitmap_dirty_bits) can cause the problem you see.
> This patch should fix it. Please confirm.

Together with the small syntax fix attached, this patch cures the problem of only half of the available pages being allocated. Now all pages are allocated when I set all bits, and they all get cleaned in-kernel as well as on-disk.

However, can you confirm that the bitmap is really used in a raid10 resync? I removed half of the disks (a correctly removable subset, of course :)), copied 100G to the degraded array, got about 7k bits set in the bitmap, and (re-)added the removed devices (mdadm correctly reports a re-add as well), but the resync looks *very* sequential. Moreover: I stopped and re-assembled the array with about 2k bits still set, and the resync starts from the beginning; I can see no skip to the previous position in the resync progress.

I'll keep watching this and will ping you again when I have more solid evidence, but perhaps you have some faster test cases; I have to wait for at least 5 hours now :)

regards
Mario
-- 
Singing is the lowest form of communication.
                -- Homer J. Simpson
diff -urN a/drivers/md/bitmap.c b/drivers/md/bitmap.c
--- a/drivers/md/bitmap.c	2009-05-01 12:50:48.463877165 +0200
+++ b/drivers/md/bitmap.c	2009-05-01 12:55:56.185432118 +0200
@@ -1021,7 +1021,6 @@
 			bitmap_set_memory_bits(bitmap,
 					       (sector_t)i << CHUNK_BLOCK_SHIFT(bitmap),
 					       needed);
-			);
 			bit_cnt++;
 			set_page_attr(bitmap, page, BITMAP_PAGE_CLEAN);
 		}