On Wed, 2012-11-21 at 08:43 +1100, NeilBrown wrote:
> On Tue, 20 Nov 2012 09:55:41 -0800 Ross Boylan <ross@xxxxxxxxxxxxxxxx> wrote:
>
> > While switching the disks a RAID 1 is based on I used the --wait command
> > to wait for the rebuild to finish. It returned immediately, but a
> > subsequent query showed it had not been rebuilt. Have I misunderstood
> > something, or is this an error?
> >
> > While doing these commands a much larger rebuild was going on with a
> > different array, involving some of the same physical disks but different
> > partitions. The partitions being rebuilt are on different physical
> > disks for the different arrays.
> >
> > Here are the logs, with version info at the end (Debian Lenny + more
> > recent kernel):
> ....
>
> > markov:~# uname -a
> > Linux markov 2.6.32-5-amd64 #1 SMP Wed Jan 12 03:40:32 UTC 2011 x86_64 GNU/Linux
> > markov:~# mdadm --version
> > mdadm - v2.6.7.2 - 14th November 2008
> >
> >
> > I notice that in this case, unlike the other array, the message during
> > the rebuild (the last detail report) does not include a line like
> >     Rebuild Status : 0% complete
> >
> > I just tried --wait again to see if there was some kind of race, but
> > once again it returned immediately, though detail says the spare is
> > rebuilding.
>
> Can you test this patch to see if it fixes the problem?
>
> diff --git a/Monitor.c b/Monitor.c
> index c4d57c3..a5e7aaa 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -973,7 +973,7 @@ int Wait(char *dev)
>  			if (e->devnum == devnum)
>  				break;
>
> -		if (!e || e->percent < 0) {
> +		if (!e || e->percent == RESYNC_NONE) {
>  			if (e && e->metadata_version &&
>  			    strncmp(e->metadata_version, "external:", 9) == 0) {
>  				if (is_subarray(&e->metadata_version[9]))
>
>
> NeilBrown

My source for 2.6.7.2 looks somewhat different. It only has 627 lines; I
think this is the relevant code (at the end of the file):

/* Not really Monitor but ...
 */
int Wait(char *dev)
{
	struct stat stb;
	int devnum;
	int rv = 1;

	if (stat(dev, &stb) != 0) {
		fprintf(stderr, Name ": Cannot find %s: %s\n", dev,
			strerror(errno));
		return 2;
	}
	if (major(stb.st_rdev) == MD_MAJOR)
		devnum = minor(stb.st_rdev);
	else
		devnum = -1-(minor(stb.st_rdev)/64);

	while(1) {
		struct mdstat_ent *ms = mdstat_read(1, 0);
		struct mdstat_ent *e;

		for (e=ms ; e; e=e->next)
			if (e->devnum == devnum)
				break;

		if (!e || e->percent < 0) {
			free_mdstat(ms);
			return rv;
		}
		free(ms);
		rv = 0;
		mdstat_wait(5);
	}
}

The section

		if (!e || e->percent < 0) {
			free_mdstat(ms);
			return rv;

is the only one with e->percent < 0. Is it OK to change that to

		if (!e || e->percent == RESYNC_NONE) {

?

Thanks.
Ross
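
For anyone comparing the two conditions, here is a minimal standalone sketch of how they can differ. It assumes that "percent" may hold more than one negative sentinel, for example one for "no resync" and another for "recovery queued behind a different array". The sentinel names and values below are assumptions for illustration; they are not taken from the 2.6.7.2 source, which may not define RESYNC_NONE at all, in which case the one-line change would not compile on its own. Whether a queued recovery is what made --wait return early here is not confirmed in the thread.

```c
#include <stdio.h>

/* Hypothetical sentinel values for "percent". These are assumptions made
 * for this sketch and are not copied from any particular mdadm release. */
enum {
	RESYNC_NONE    = -1,	/* no resync or recovery in progress */
	RESYNC_DELAYED = -2,	/* recovery queued behind another array */
	RESYNC_PENDING = -3	/* recovery requested but not started */
};

/* Old test: any negative percent is treated as "nothing to wait for". */
int done_old(int percent)
{
	return percent < 0;
}

/* Patched test: only the explicit "none" state ends the wait. */
int done_new(int percent)
{
	return percent == RESYNC_NONE;
}

/* Print how both tests classify one state. */
void show(const char *label, int percent)
{
	printf("%-8s percent=%3d  old=%d  new=%d\n",
	       label, percent, done_old(percent), done_new(percent));
}

/* Walk through the states of interest. */
void demo(void)
{
	show("none", RESYNC_NONE);
	show("delayed", RESYNC_DELAYED);
	show("pending", RESYNC_PENDING);
	show("active", 37);
}
```

Calling demo() from a main() shows the two tests agreeing for the "none" and "active" rows and disagreeing for "delayed" and "pending": the old test would end the wait in those states, while the patched test would keep waiting.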