Re: FW: [PATCH 008 of 9] md: Fix possible raid1/raid10 deadlock on read error during resync.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday March 24, md2sf@xxxxxxxx wrote:
> 
> > 
> > -----Original Message-----
> > From: linux-kernel-owner@xxxxxxxxxxxxxxx
> > [mailto:linux-kernel-owner@xxxxxxxxxxxxxxx] On Behalf Of NeilBrown
> > Sent: Sunday, March 02, 2008 5:18 PM
> > To: Andrew Morton
> > Cc: linux-raid@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; K.Tanaka
> > Subject: [PATCH 008 of 9] md: Fix possible raid1/raid10 deadlock on read
> > error during resync.
> > 
> 
> > diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
> > --- .prev/drivers/md/raid1.c	2008-03-03 11:03:39.000000000 +1100
> > +++ ./drivers/md/raid1.c	2008-03-03 09:56:52.000000000 +1100
> > @@ -704,13 +704,20 @@ static void freeze_array(conf_t *conf)
> >   	/* stop syncio and normal IO and wait for everything to
> >   	 * go quite.
> >   	 * We increment barrier and nr_waiting, and then
> > -	 * wait until barrier+nr_pending match nr_queued+2
> > +	 * wait until nr_pending match nr_queued+1
> > +	 * This is called in the context of one normal IO request
> > +	 * that has failed. Thus any sync request that might be pending
> > +	 * will be blocked by nr_pending, and we need to wait for
> > +	 * pending IO requests to complete or be queued for re-try.
> > +	 * Thus the number queued (nr_queued) plus this request (1)
> > +	 * must match the number of pending IOs (nr_pending) before
> > +	 * we continue.
> >   	 */
> >   	spin_lock_irq(&conf->resync_lock);
> >   	conf->barrier++;
> >   	conf->nr_waiting++;
> >   	wait_event_lock_irq(conf->wait_barrier,
> > -			    conf->barrier+conf->nr_pending ==
> > conf->nr_queued+2,
> > +			    conf->nr_pending == conf->nr_queued+1,
> >   			    conf->resync_lock,
> >   			    ({ flush_pending_writes(conf);
> >   			       raid1_unplug(conf->mddev->queue); }));
> > --
> 
> When we call freeze_array, it is after reschedule_retry, during which conf->nr_queued is already incremented.
> Should we use conf->nr_pending == conf->nr_pending here?
> 

Can I assume you mean
          conf->nr_pending == conf->nr_queued
??

Yes, it is after reschedule_retry which increments ->nr_queued, but
also after
       conf->nr_queued--;
in raid1d when the request is removed from the queue.

Does that make sense?

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux