On Mon, 17 Sep 2012 10:37:55 +0200 Lukasz Dorau <lukasz.dorau@xxxxxxxxx> wrote: > Mdmon has not read component_size (in read_and_act()) on wakeup so far. > In case of size's expansion component_size has been changed but > mdmon has not updated it. As a result the metadata was incorrect > during the size's expansion. It has contained no information > that resync was in progress (there has been no checkpoint too). > The metadata has been as if resync has already been finished > but it has not. > > Updating of component_size has been added to read_and_act(). > Now mdmon uses the correct value of component_size and the correct > metadata (containing information about resync and checkpoint) is written. Thanks, but I don't think I like this approach. In my mind, the 'monitor' thread should only be monitoring things that it has to respond to quickly - before another write can complete. A size change does not need to be so immediate. So I would rather that mdadm updates component_size but not array_size, and then 'ping's the manager. The manager thread then checks and notices that the component size has changed, asks the monitor to update the metadata, then sets the array_size to match. Possibly mdadm shouldn't just 'ping' the monitor, but should send the metadata update ... not sure which is best. Does that make sense to you? Thanks, NeilBrown > > Signed-off-by: Lukasz Dorau <lukasz.dorau@xxxxxxxxx> > --- > managemon.c | 6 +++++- > mdmon.h | 1 + > monitor.c | 12 ++++++++++++ > 3 files changed, 18 insertions(+), 1 deletions(-) > > diff --git a/managemon.c b/managemon.c > index ef351b3..fd41463 100644 > --- a/managemon.c > +++ b/managemon.c > @@ -123,6 +123,8 @@ static void close_aa(struct active_array *aa) > close(aa->info.state_fd); > if (aa->resync_start_fd >= 0) > close(aa->resync_start_fd); > + if (aa->component_size_fd >= 0) > + close(aa->component_size_fd); > if (aa->metadata_fd >= 0) > close(aa->metadata_fd); > if (aa->sync_completed_fd >= 0) > @@ -598,7 +600,8 @@ static int aa_ready(struct active_array *aa) > if (aa->info.state_fd < 0) > return 0; > > - if (level > 0 && (aa->action_fd < 0 || aa->resync_start_fd < 0)) > + if (level > 0 && (aa->action_fd < 0 || aa->resync_start_fd < 0 \ > + || aa->component_size_fd < 0)) > return 0; > > if (!aa->container) > @@ -676,6 +679,7 @@ static void manage_new(struct mdstat_ent *mdstat, > new->action_fd = sysfs_open(new->devnum, NULL, "sync_action"); > new->info.state_fd = sysfs_open(new->devnum, NULL, "array_state"); > new->resync_start_fd = sysfs_open(new->devnum, NULL, "resync_start"); > + new->component_size_fd = sysfs_open(new->devnum, NULL, "component_size"); > new->metadata_fd = sysfs_open(new->devnum, NULL, "metadata_version"); > new->sync_completed_fd = sysfs_open(new->devnum, NULL, "sync_completed"); > dprintf("%s: inst: %d action: %d state: %d\n", __func__, atoi(inst), > diff --git a/mdmon.h b/mdmon.h > index 59e1b53..0b82598 100644 > --- a/mdmon.h > +++ b/mdmon.h > @@ -32,6 +32,7 @@ struct active_array { > > int action_fd; > int resync_start_fd; > + int component_size_fd; /* in case of change of array's size */ > int metadata_fd; /* for monitoring rw/ro status */ > int sync_completed_fd; /* for checkpoint notification events */ > unsigned long long last_checkpoint; /* sync_completed fires for many > diff --git a/monitor.c b/monitor.c > index c987d10..aa58384 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -80,6 +80,17 @@ static unsigned long long read_resync_start(int fd) > return strtoull(buf, NULL, 10); > } > > +static unsigned long long read_component_size(int fd) > +{ > + char buf[30]; > + int n; > + > + n = read_attr(buf, 30, fd); > + if (n <= 0) > + return 0; > + return strtoull(buf, NULL, 10); > +} > + > static unsigned long long read_sync_completed(int fd) > { > unsigned long long val; > @@ -229,6 +240,7 @@ static int read_and_act(struct active_array *a) > a->curr_state = read_state(a->info.state_fd); > a->curr_action = read_action(a->action_fd); > a->info.resync_start = read_resync_start(a->resync_start_fd); > + a->info.component_size = read_component_size(a->component_size_fd) << 1; > sync_completed = read_sync_completed(a->sync_completed_fd); > for (mdi = a->info.devs; mdi ; mdi = mdi->next) { > mdi->next_state = 0; > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html
Attachment:
signature.asc
Description: PGP signature