Re: [patch 6/8] raid5: make_request use batch stripe release

On Fri, 8 Jun 2012 14:16:57 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:

> On Thu, Jun 07, 2012 at 03:58:16PM +0800, Shaohua Li wrote:
> > On Thu, Jun 07, 2012 at 05:33:10PM +1000, NeilBrown wrote:
> > > On Thu, 7 Jun 2012 14:33:58 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > > 
> > > > On Thu, Jun 07, 2012 at 11:23:45AM +1000, NeilBrown wrote:
> > > > > On Mon, 04 Jun 2012 16:01:58 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > > > > 
> > > > > > make_request() releases every stripe individually, and the stripe usually
> > > > > > has count 1, which defeats the previous release_stripe() optimization. In
> > > > > > my test, this release_stripe() becomes the heaviest place to take
> > > > > > conf->device_lock after the previous patches are applied.
> > > > > > 
> > > > > > The patch below batches stripe release. When a batch reaches its maximum
> > > > > > number of stripes, the batch is flushed out. The batch is also flushed
> > > > > > when unplug is called.
> > > > > > 
> > > > > > Signed-off-by: Shaohua Li <shli@xxxxxxxxxxxx>
> > > > > 
> > > > > I like the idea of a batched release.
> > > > > > I don't like the per-cpu variables... and I don't think it is safe to only
> > > > > > allocate them for_each_present_cpu without supporting cpu-hot-plug.
> > > > > 
> > > > > I would much rather keep a list of stripes (linked on ->lru) in struct
> > > > > md_plug_cb (or maybe in some structure which contains that) and release them
> > > > > all on unplug - and only on unplug.
> > > > > 
> > > > > Maybe pass a size to mddev_check_unplugged, and it allocates that much more
> > > > > space.  Get mddev_check_unplugged to return the md_plug_cb structure.
> > > > > If the new space is NULL, then list_head_init it, and change the cb.callback
> > > > > to a raid5 specific function.
> > > > > Then add any stripe to the md_plug_cb, and in the unplug function, release
> > > > > them all.
> > > > > 
> > > > > Does that make sense?
> > > > > 
> > > > > Also I would rather the batched stripe release code were defined in the same
> > > > > patch that used it.  It isn't big enough to justify a separate patch.
> > > > 
> > > > > The stripe->lru needs the protection of device_lock, so I can't use a list.
> > > > > An array is preferred. I really don't like the idea of allocating memory,
> > > > > especially when allocating an array. I'll fix the code for cpuhotplug.
> > > 
> > > You don't need device_lock to use ->lru.
> > > Currently the lru is not used when sh->count is not-zero unless
> > > STRIPE_EXPANDING is set - and we never attach IO requests if STRIPE_EXPANDING
> > > is set.
> > > So when make_request wants to release a stripe_head, ->lru is currently
> > > unused.
> > > So we can use it to put the stripe on a per-thread list without locking.
> > > 
> > > We need another stripe_head flag to say "is on a per-thread unplug list" to
> > > avoid racing between processes, but we don't need a spinlock for that.
> > > ie.
> > >   if (!test_and_set(STRIPE_ON_UNPLUG_LIST, &sh->state))
> > >            list_add(&plug->list, &sh->lru);
> > > 
> > > or similar.
> > 
> > > I did see some BUG_ONs trigger before when I accessed ->lru without holding
> > > device_lock; for example, get_active_stripe will remove it from the list.
> > > Maybe we can use the same bit to avoid that. Let me try.
> 
> Thinking a bit more, the STRIPE_ON_UNPLUG_LIST bit can't avoid races. For
> example,
> Task 1 hits a stripe, assume stripe count 0 (it could be non-zero too).
> It does:
> 1. inc count
> 2. set STRIPE_ON_UNPLUG_LIST
> 3. add stripe to plug list
> 4. unplug to release the stripe
> Between 3 and 4, task 2 hits the stripe. It does:
> A: inc count. Since the bit is set, do nothing more
> B: unplug
> If the order is 3, A, 4, B, task 1 will not release the stripe, since the
> count is 2. Task 2 will not release the stripe, since the stripe isn't in its
> list. The stripe will never be handled.

"Since the bit set, do nothing" isn't correct - we need to release the
reference.
So it should be
   if (!test_and_set_bit(STRIPE_ON_UNPLUG_LIST, &sh->state))
          list_add(&sh->lru, &plug->list);
   else
          release_stripe(sh);

We expect that in most cases release_stripe will just decrement the counter
and not need to take the lock.
Then the unplug takes the lock and calls __release_stripe() on all the
stripes.  So the stripe always gets released, either immediately or
at unplug.

So my initial attempt at a code fragment was incomplete.
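
To make the sequence concrete, here is a rough user-space sketch in C of the
scheme above. It is not the kernel code: struct stripe, struct plug,
stripe_init(), get_stripe(), release_stripe_plugged(), unplug() and
replay_order_3_A_4_B() are hypothetical stand-ins, C11 atomics model
test_and_set_bit() and sh->count, and clearing the flag in unplug() before
dropping the reference is an assumption of the sketch rather than something
settled in this thread. The replay function runs the order 3, A, 4, B from the
quoted example in a single thread and checks that the stripe is still handled
exactly once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct stripe_head. */
struct stripe {
    atomic_int count;            /* models sh->count */
    atomic_flag on_unplug_list;  /* models the STRIPE_ON_UNPLUG_LIST bit */
    struct stripe *next;         /* models linking through sh->lru */
    int handled;                 /* times the last reference was dropped */
};

/* One per task, so the list itself needs no lock. */
struct plug {
    struct stripe *head;
};

static void stripe_init(struct stripe *sh)
{
    atomic_init(&sh->count, 0);
    atomic_flag_clear(&sh->on_unplug_list);
    sh->next = NULL;
    sh->handled = 0;
}

static void get_stripe(struct stripe *sh)
{
    atomic_fetch_add(&sh->count, 1);
}

/* Models release_stripe(): drop one reference, handle the stripe on the last. */
static void release_stripe(struct stripe *sh)
{
    if (atomic_fetch_sub(&sh->count, 1) == 1)
        sh->handled++;  /* non-atomic: fine for the single-threaded replay */
}

/* Queue the stripe on this task's plug list, or drop the reference now. */
static void release_stripe_plugged(struct plug *plug, struct stripe *sh)
{
    if (!atomic_flag_test_and_set(&sh->on_unplug_list)) {
        sh->next = plug->head;   /* first to set the flag: defer the release */
        plug->head = sh;
    } else {
        release_stripe(sh);      /* already queued elsewhere: release at once */
    }
}

/* Release every stripe queued on this task's plug list. */
static void unplug(struct plug *plug)
{
    while (plug->head) {
        struct stripe *sh = plug->head;

        plug->head = sh->next;
        sh->next = NULL;
        atomic_flag_clear(&sh->on_unplug_list);  /* assumption: clear first */
        release_stripe(sh);
    }
}

/* Replays the order 3, A, 4, B and returns how often the stripe was handled. */
int replay_order_3_A_4_B(void)
{
    struct stripe sh;
    struct plug task1 = { NULL }, task2 = { NULL };

    stripe_init(&sh);

    get_stripe(&sh);                      /* 1: task 1 takes a reference */
    release_stripe_plugged(&task1, &sh);  /* 2, 3: set flag, queue on task 1 */
    get_stripe(&sh);                      /* A: task 2 takes a reference */
    release_stripe_plugged(&task2, &sh);  /* A: flag set, so release right away */
    unplug(&task1);                       /* 4: task 1 drops the queued reference */
    unplug(&task2);                       /* B: task 2's list is empty */

    assert(atomic_load(&sh.count) == 0);
    return sh.handled;
}
```

Calling replay_order_3_A_4_B() from a main() returns 1: task 2's immediate
release takes the count from 2 to 1, and task 1's unplug drops the last
reference, so the stripe is not left behind.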

Thanks,
NeilBrown
