On Tuesday June 20, nigel@xxxxxxxxxxxxxx wrote:
> Nigel J. Terry wrote:
>
> Well good news and bad news I'm afraid...
>
> Well I would like to be able to tell you that the time calculation now
> works, but I can't. Here's why: Why I rebooted with the newly built
> kernel, it decided to hit the magic 21 reboots and hence decided to
> check the array for clean. The normally takes about 5-10 mins, but this
> time took several hours, so I went to bed! I suspect that it was doing
> the full reshape or something similar at boot time.
>

What "magic 21 reboots"?? md has no mechanism to automatically check
the array after N reboots or anything like that. Or are you thinking
of the 'fsck' that does a full check every so-often?

> Now I am not sure that this makes good sense in a normal environment.
> This could keep a server down for hours or days. I might suggest that if
> such work was required, the clean check is postponed till next boot and
> the reshape allowed to continue in the background.

An fsck cannot tell if there is a reshape happening, but the reshape
should notice the fsck and slow down to a crawl so the fsck can
complete...

>
> Anyway the good news is that this morning, all is well, the array is
> clean and grown as can be seen below. However, if you look further below
> you will see the section from dmesg which still shows RIP errors, so I
> guess there is still something wrong, even though it looks like it is
> working. Let me know if i can provide any more information.
>
> Once again, many thanks. All I need to do now is grow the ext3 filesystem...
.....
> ...ok start reshape thread
> md: syncing RAID array md0
> md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
> md: using maximum available idle IO bandwidth (but not more than 200000
> KB/sec) for reconstruction.
> md: using 128k window, over a total of 245111552 blocks.
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
> <0000000000000000>{stext+2145382632}
> PGD 7c3f9067 PUD 7cb9e067 PMD 0
....
> Process md0_reshape (pid: 1432, threadinfo ffff81007aa42000, task
> ffff810037f497b0)
> Stack: ffffffff803dce42 0000000000000000 000000001d383600 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> Call Trace: <ffffffff803dce42>{md_do_sync+1307}
> <ffffffff802640c0>{thread_return+0}
> <ffffffff8026411e>{thread_return+94}
> <ffffffff8029925d>{keventd_create_kthread+0}
> <ffffffff803dd3d9>{md_thread+248}

That looks very much like the bug that I already sent you a patch for!
Are you sure that the new kernel still had this patch?

I'm a bit confused by this....

NeilBrown