> -----Original Message-----
> From: linux-kernel-owner@xxxxxxxxxxxxxxx
> [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of NeilBrown
> Sent: Sunday, March 02, 2008 5:18 PM
> To: Andrew Morton
> Cc: linux-raid@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; K.Tanaka
> Subject: [PATCH 008 of 9] md: Fix possible raid1/raid10 deadlock on read
> error during resync.
>
> diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
> --- .prev/drivers/md/raid1.c 2008-03-03 11:03:39.000000000 +1100
> +++ ./drivers/md/raid1.c 2008-03-03 09:56:52.000000000 +1100
> @@ -704,13 +704,20 @@ static void freeze_array(conf_t *conf)
>      /* stop syncio and normal IO and wait for everything to
>       * go quite.
>       * We increment barrier and nr_waiting, and then
> -     * wait until barrier+nr_pending match nr_queued+2
> +     * wait until nr_pending match nr_queued+1
> +     * This is called in the context of one normal IO request
> +     * that has failed. Thus any sync request that might be pending
> +     * will be blocked by nr_pending, and we need to wait for
> +     * pending IO requests to complete or be queued for re-try.
> +     * Thus the number queued (nr_queued) plus this request (1)
> +     * must match the number of pending IOs (nr_pending) before
> +     * we continue.
>       */
>      spin_lock_irq(&conf->resync_lock);
>      conf->barrier++;
>      conf->nr_waiting++;
>      wait_event_lock_irq(conf->wait_barrier,
> -                        conf->barrier+conf->nr_pending == conf->nr_queued+2,
> +                        conf->nr_pending == conf->nr_queued+1,
>                          conf->resync_lock,
>                          ({ flush_pending_writes(conf);
>                             raid1_unplug(conf->mddev->queue); }));
> --

freeze_array is called after reschedule_retry, which has already
incremented conf->nr_queued for the failed request. Should the wait
condition be conf->nr_pending == conf->nr_queued here, without the +1?
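To make the counting concrete, here is a toy model of the two counters.
It is only a sketch of the scenario described in the question, not kernel
code: struct toy_conf and every toy_* function are hypothetical names made
up for illustration, and it does not model anything the real driver may do
to nr_queued between reschedule_retry and freeze_array. It compares the
condition from the patch with the alternative without the +1, which is
what the question appears to intend.

```c
#include <stdbool.h>

/* Toy stand-in for the two conf_t counters discussed above.
 * Hypothetical type, not the kernel's conf_t. */
struct toy_conf {
    int nr_pending; /* requests submitted and not yet completed */
    int nr_queued;  /* failed requests parked for retry */
};

/* Models only the step the question attributes to reschedule_retry:
 * the failed request is counted in nr_queued. */
void toy_reschedule_retry(struct toy_conf *conf)
{
    conf->nr_queued++;
}

/* Wait condition as written in the patch. */
bool toy_cond_patch(const struct toy_conf *conf)
{
    return conf->nr_pending == conf->nr_queued + 1;
}

/* Alternative wait condition without the +1. */
bool toy_cond_no_plus_one(const struct toy_conf *conf)
{
    return conf->nr_pending == conf->nr_queued;
}
```

With a single pending request (nr_pending = 1, nr_queued = 0) that fails
and goes through toy_reschedule_retry, the model ends at nr_pending = 1,
nr_queued = 1. In that state toy_cond_patch is false and
toy_cond_no_plus_one is true. Whether the real code reaches freeze_array
in that state depends on whether nr_queued is adjusted again before the
call, which this sketch deliberately leaves out.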