On Tue, Jun 12, 2012 at 2:02 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > On Wed, Jun 6, 2012 at 11:52 PM, Shaohua Li <shli@xxxxxxxxxx> wrote: >> On Thu, Jun 07, 2012 at 04:35:22PM +1000, NeilBrown wrote: >>> On Thu, 7 Jun 2012 14:29:39 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote: >>> >>> > On Thu, Jun 07, 2012 at 10:54:10AM +1000, NeilBrown wrote: >>> > > On Mon, 04 Jun 2012 16:01:53 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote: >>> > > >>> > > > Add a per-stripe lock to protect stripe specific data, like dev->read, >>> > > > written, ... The purpose is to reduce lock contention of conf->device_lock. >>> > > >>> > > I'm not convinced that you need to add a lock. >>> > > I am convinced that if you do add one you need to explain exactly what it is >>> > > protecting. >>> > > >>> > > The STRIPE_ACTIVE bit serves as a lock and ensures that only one process can >>> > > be in handle_stripe at a time. >>> > > So I don't think dev->read actually needs any protection (though I haven't >>> > > checked thoroughly). >>> > > >>> > > I think the only things that device_lock protects are things shared by >>> > > multiple stripes, so adding a per-stripe spinlock isn't going to help remove >>> > > device_lock. >>> > >>> > This sounds not true to me. both the async callbacks and request completion >>> > access stripe data, like dev->read. Such things are not protected by >>> > STRIPE_ACTIVE bit. Thought we can delete STRIPE_ACTIVE bit with stripe lock >>> > introduced. >>> >>> Please give specifics. What race do you see with access to dev->read that is >>> not protected by STRIPE_ACTIVE ? >> >> For example, ops_complete_biofill() will change dev->read which isn't protected >> by STRIPE_ACTIVE. add_stripe_bio() checks ->toread ->towrite, which isn't >> protected by the bit too. Am I missing anything? > > STRIPE_ACTIVE is the replacement for the old per-stripe lock. That > lock never was meant/able to synchronize add_stripe_bio() vs ops_run_* > (producer vs consumer). That's always been device_lock's job because > an individual bio may be added to several stripes. If device_lock is > gone we need a different scheme. That's what tripped me up last time > I looked at this. Actually now that I look at add_stripe_bio again, I think it could be made to work if: 1/ bi_phys_segments is incremented prior to publishing the bio on to{read|write} otherwise we potentially race with a consumer without a reference 2/ making sure the overlap checking does not walk off into invalid bios as it may do once we no longer have a global lock Outside of that we need a lock for making sure bi->next fields are updated properly and two threads only collide there on the same stripe. But I need to think more about the other usages. -- Dan -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html