Re: [patch 03/10 v3] raid5: add a per-stripe lock

NeilBrown <neilb@xxxxxxx> · Mon, 2 Jul 2012 17:39:53 +1000

On Mon, 2 Jul 2012 11:16:26 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:

> > Then I could see what is being added and what is being removed all in the one
> > patch and I can be sure that they balance.
> 
> reworked the patch 3-5 to two patches as you suggested, and sent to you. please check.

Thanks.  That's looking really good.

However I think we can do better.  I've been looking more closely at the code
and I think that the only things that we need stripe_lock to protect are
->toread and ->towrite, together with the following bios.  e.g.
->toread->bi_next etc.

->read and ->written don't need stripe_lock protection, as they are only
manipulated by the handle_stripe state machine which uses STRIPE_ACTIVE 
and refcounts for exclusion.

So add_stripe_bio need to take the lock while adding a bio to the
->toread and  ->towrite lists, and ops_run_biodrain() and ops_run_biofill
need to take the lock while the move the list from ->to{read,write} to
->{read,written}.  But we don't need it anywhere else.  e.g. analyse_stripe
shouldn't need the lock at all.  Any change that could happen during the loop
could equally happen after the lock was released so we don't lose by not
having the lock.

There is another current user of the lock, but I think that should be
discarded as a false optimisation.
We currently try to optimise out extra calls to bitmap_startwrite and
bitmap_endwrite when we see back-to-back writes to the one stripe.  However I
suspect that is extremely unlikely and it just imposes and pointless need for
synchronisation in raid5.

We just simply call bitmap_startwrite whenever ->towrite changes from NULL to
non-NULL, and call bitmap_endwrite whenever we clear ->written, reguardless
what value ->towrite now has.

Would you like to experiment with that?  If I haven't described it well
enough I can write a patch to show what I mean.

Thanks,
NeilBrown
Attachment:
signature.asc

Description: PGP signature