On Thu, 2013-11-07 at 11:59 -0800, Michel Lespinasse wrote:
> On Thu, Nov 7, 2013 at 6:31 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, Nov 07, 2013 at 04:50:23AM -0800, Michel Lespinasse wrote:
> >> On Thu, Nov 7, 2013 at 4:06 AM, Linus Torvalds
> >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >> >
> >> > On Nov 7, 2013 6:55 PM, "Michel Lespinasse" <walken@xxxxxxxxxx> wrote:
> >> >>
> >> >> Rather than writing arch-specific locking code, would you agree to
> >> >> introduce acquire and release memory operations?
> >> >
> >> > Yes, that's probably the right thing to do. What ops do we need? Store with
> >> > release, cmpxchg and load with acquire? Anything else?
> >>
> >> Depends on what lock types we want to implement on top; for MCS we would need:
> >> - xchg acquire (common case) and load acquire (for spinning on our
> >> locker's wait word)
> >> - cmpxchg release (when there is no next locker) and store release
> >> (when writing to the next locker's wait word)
> >>
> >> One downside of the proposal is that using a load acquire for spinning
> >> puts the memory barrier within the spin loop. So this model is very
> >> intuitive and does not add unnecessary barriers on x86, but it may
> >> place the barriers in a suboptimal place for architectures that need
> >> them.
> >
> > OK, I will bite... Why is a barrier in the spinloop suboptimal?
>
> It's probably not a big deal - all I meant to say is that if you were
> manually placing barriers, you would probably put one after the loop
> instead. I don't deal much with architectures where such barriers are
> needed, so I don't know for sure if the difference means much.

We could do a load acquire at the end of the spin loop in the lock
function, and not in the spin loop itself, if the cost of a barrier within
the spin loop is a concern.

Michel, are you planning to do an implementation of the
load-acquire/store-release functions for the various architectures?
Or is the approach of an arch-specific memory barrier for MCS an acceptable
one before load-acquire and store-release are available? Are there any
technical issues remaining with the patchset after including Waiman's
arch-specific barrier?

Tim

>
> > Can't say that I have tried measuring it, but the barrier should not
> > normally result in interconnect traffic. Given that the barrier is
> > required anyway, it should not affect lock-acquisition latency.
>
> Agree
>
> > So what am I missing here?
>
> I think you read my second email as me trying to shoot down a proposal
> - I wasn't, as I really like the acquire/release model and find it
> easy to program with, which is why I'm proposing it in the first
> place. I just wanted to be upfront about all potential downsides, so
> we can consider them and see if they are significant - I don't think
> they are, but I'm not the best person to judge that as I mostly just
> deal with x86 stuff.
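
For reference, the four MCS operations listed in the quoted text (xchg
acquire, load acquire, cmpxchg release, store release), together with the
barrier-after-the-loop placement, could be sketched in portable C11 with
<stdatomic.h> roughly as follows. This is only an illustration under stated
assumptions, not the kernel patchset: the names mcs_node, mcs_lock,
mcs_lock_init, mcs_lock_acquire and mcs_lock_release are hypothetical, and
the enqueue exchange uses acq_rel rather than a plain acquire because, in the
C11 model, the node initialisation should also be published to the next
locker.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-locker queue node; each waiter spins on its own word. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;               /* true while this locker must wait */
};

/* Hypothetical lock word: tail of the waiter queue, NULL when free. */
struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

static void mcs_lock_init(struct mcs_lock *lock)
{
    atomic_init(&lock->tail, NULL);
}

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *prev;

    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&node->locked, true, memory_order_relaxed);

    /* "xchg acquire": enqueue at the tail (acq_rel is this sketch's choice). */
    prev = atomic_exchange_explicit(&lock->tail, node, memory_order_acq_rel);
    if (prev == NULL)
        return;                       /* common case: lock was free */

    /* Link behind the previous locker so it can find our wait word. */
    atomic_store_explicit(&prev->next, node, memory_order_release);

    /* Spin with relaxed loads, so no barrier sits inside the loop... */
    while (atomic_load_explicit(&node->locked, memory_order_relaxed))
        ;
    /* ...and place a single acquire barrier after the loop instead. */
    atomic_thread_fence(memory_order_acquire);
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *node)
{
    struct mcs_node *next =
        atomic_load_explicit(&node->next, memory_order_acquire);

    if (next == NULL) {
        struct mcs_node *expected = node;

        /* "cmpxchg release": no next locker, so reset the tail to NULL. */
        if (atomic_compare_exchange_strong_explicit(&lock->tail, &expected,
                NULL, memory_order_release, memory_order_relaxed))
            return;

        /* A new locker swapped the tail but has not linked itself yet. */
        while ((next = atomic_load_explicit(&node->next,
                                            memory_order_acquire)) == NULL)
            ;
    }

    /* "store release": hand the lock over via the next locker's wait word. */
    atomic_store_explicit(&next->locked, false, memory_order_release);
}
```

Using memory_order_acquire on the load inside the loop and dropping the
fence would give the load-acquire spinning form from the quoted proposal;
both placements provide the same ordering once the loop exits.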