Re: [patch 3/3] raid5: relieve lock contention in get_active_stripe()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 10 Sep 2013 12:24:38 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:

> On Tue, Sep 10, 2013 at 02:06:29PM +1000, NeilBrown wrote:
> > On Tue, 10 Sep 2013 10:35:55 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > 
> > > On Tue, Sep 10, 2013 at 11:13:18AM +1000, NeilBrown wrote:
> > > > On Mon, 9 Sep 2013 12:33:18 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > > > >  		} else {
> > > > > +			spin_lock(&conf->device_lock);
> > > > > +
> > > > >  			if (atomic_read(&sh->count)) {
> > > > >  				BUG_ON(!list_empty(&sh->lru)
> > > > >  				    && !test_bit(STRIPE_EXPANDING, &sh->state)
> > > > > @@ -611,13 +725,14 @@ get_active_stripe(struct r5conf *conf, s
> > > > >  					sh->group = NULL;
> > > > >  				}
> > > > >  			}
> > > > > +			spin_unlock(&conf->device_lock);
> > > > 
> > > > The device_lock is only really needed in the 'else' branch of the if
> > > > statement.  So can we have it only there.  i.e. don't take the lock if
> > > > sh->count is non-zero.
> > > 
> > > This is correct, I assume this isn't worthy optimizing before. Will fix soon.
> > 
> > It isn't really about optimising performance.  It is about making the code
> > easier to understand.  If we keep the region covered by the lock as small as
> > reasonably possible, it makes it more obvious to the reader which values are
> > being protected.
> > 
> >  
> > > > > -	spin_lock_irqsave(&conf->device_lock, flags);
> > > > > +	lock_all_device_hash_locks_irqsave(conf, &flags);
> > > > >  	clear_bit(In_sync, &rdev->flags);
> > > > >  	mddev->degraded = calc_degraded(conf);
> > > > > -	spin_unlock_irqrestore(&conf->device_lock, flags);
> > > > > +	unlock_all_device_hash_locks_irqrestore(conf, &flags);
> > > > >  	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> > > > 
> > > > Why do you think you need to take all the hash locks here and elsewhere when
> > > > ->degraded is set?
> > > > The lock is only need to ensure that the 'In_sync' flags are consistent with
> > > > the 'degraded' count.
> > > > ->degraded isn't used in get_active_stripe so I cannot see how it is relevant
> > > > to the hash locks.
> > > > 
> > > > We need to lock everything in raid5_quiesce().  I don't think we need to
> > > > anywhere else.
> > > 
> > > init_stripe() accesses some filelds, don't need to protect?
> > 
> > What fields?  Not ->degraded.
> > 
> > I think the fields that it accesses are effectively protected by the new
> > seqlock.
> > If you don't think so, please be explicit.
> 
> Like raid_disks, previous_raid_disks, chunk_sectors, prev_chunk_sectors,
> algorithm and so on. They are used in raid5_compute_sector(), stripe_set_idx()
> and init_stripe(). The former two are called by init_stripe().

Yes.  Those are only changed in raid5_start_reshape() and are protected by
conf->gen_lock.

If they change while init_stripe is running, the read_seqcount_retry() call in
make_request() will notice the inconsistency, release the stripe, and try
again.

I guess we probably need an extra check on gen_lock inside init_stripe().
i.e. a
  do {
     seq = read_seqcount_begin(&conf->gen_lock);

just after the "remove_hash(sh)", and a

  } while (read_seqcount_retry(&conf->gen_lock, seq));

just before the "insert_hash(sh)".  That will ensure the stripe inserted into
the hash is consistent.  The read_seqcount_retry() in make_request is still
needed to ensure that the correct stripe_head is used.

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux