Re: [PATCH v6 4/5] MCS Lock: Barrier corrections

On Thu, Nov 28, 2013 at 06:53:41PM +0000, Will Deacon wrote:
> On Thu, Nov 28, 2013 at 06:27:12PM +0000, Paul E. McKenney wrote:
> > On Thu, Nov 28, 2013 at 06:03:18PM +0000, Will Deacon wrote:
> > > Hmm, without horrible hacks to keep track of whether we've done an
> > > mb__before_spinlock() without a matching spinlock(), that's going to end up
> > > with full-barrier + pointless half-barrier (similarly on the release path).
> > 
> > We should be able to detect mb__before_spinlock() without a matching
> > spinlock via static analysis, right?
> > 
> > Or am I missing your point?
> 
> See below...
> 
> > > > Yes, we might need better names, but I believe that this approach does
> > > > what you need.
> > > > 
> > > > Thoughts?
> > > 
> > > I still think we need to draw the distinction between ordering all accesses
> > > against a lock and ordering an unlock against a lock. The latter is free for
> > > arm64 (STLR => LDAR is ordered) but the former requires a DMB.
> > > 
> > > Not sure I completely got your drift...
> > 
> > Here is what I am suggesting:
> > 
> > o	mb__before_spinlock():
> > 
> > 	o	Must appear immediately before a lock acquisition.
> > 	o	Upgrades a lock acquisition to a full barrier.
> > 	o	Emits DMB on ARM64.
> 
> Ok, so that then means that:
> 
> 	mb__before_spinlock();
> 	spin_lock();
> 
> on ARM64 expands to:
> 
> 	dmb	ish
> 	ldaxr	...
> 
> so there's a redundant half-barrier there. If we want to get rid of that, we
> need mb__before_spinlock() to set a flag, then we could conditionalise
> ldaxr/ldxr but it's really horrible and you have to deal with interrupts
> etc. so in reality we just end up having extra barriers.

Given that there was just a dmb, how much does the ish &c really hurt?
Would the performance difference be measurable at the system level?
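
For reference, the sequence under discussion can be sketched in userspace C11 atomics. This is only an illustration, not kernel code: the sketch_* names and the test-and-set lock are hypothetical, and the comments about which instructions the compiler emits on arm64 describe typical output rather than a guarantee.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace test-and-set lock; not the kernel's spinlock_t. */
struct sketch_lock {
	atomic_bool locked;
};

/* Stand-in for mb__before_spinlock(): a full fence (typically dmb ish on arm64). */
static inline void sketch_mb_before_lock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Acquire-only lock: the exchange carries acquire semantics (the half-barrier). */
static inline void sketch_lock_acquire(struct sketch_lock *l)
{
	while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
		;	/* spin until the previous value was false */
}

/* Release-only unlock. */
static inline void sketch_unlock(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* The pair from the thread: full barrier, then an ordinary acquire. */
static inline void sketch_lock_full(struct sketch_lock *l)
{
	sketch_mb_before_lock();
	sketch_lock_acquire(l);
}
```

The acquire in sketch_lock_acquire() is the "redundant half-barrier" Will refers to; whether dropping it after a full fence is worth the extra complexity is exactly the open question here.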

> Or we have a separate spin_lock_mb() function.

And mutex_lock_mb().  And spin_lock_irqsave_mb().  And spin_lock_irq_mb().
And...

Admittedly this is not yet a problem given the current very low usage
of smp_mb__before_spinlock(), but the potential for API explosion is
non-trivial.

That said, if the effect on ARM64 is measurable at the system level, I
won't stand in the way of the additional APIs.

> > o	mb__after_spinlock():
> > 
> > 	o	Must appear immediately after a lock acquisition.
> > 	o	Upgrades an unlock+lock pair to a full barrier.
> > 	o	Emits a no-op on ARM64, as in "do { } while (0)".
> > 	o	Might need a separate flavor for queued locks on
> > 		some platforms, but no sign of that yet.
> 
> Ok, so mb__after_spinlock() doesn't imply a full barrier but
> mb__before_spinlock() does? I think people will get that wrong :)

As I said earlier in the thread, I am open to better names.

How about smp_mb__after_spin_unlock_lock_pair()?  That said, I am sure that
I could come up with something longer given enough time.  ;-)
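
A rough userspace sketch of what such a primitive might look like, again in C11 atomics. Everything named sketch_* is hypothetical. The arm64 no-op follows the proposal quoted above (relying on STLR => LDAR being ordered), and the full fence used on other architectures is purely an assumption made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the proposed
 * smp_mb__after_spin_unlock_lock_pair(); not a kernel definition.
 */
#if defined(__aarch64__)
/* Per the proposal: no-op, relying on release => acquire ordering. */
#define sketch_mb_after_unlock_lock_pair()	do { } while (0)
#else
/* Assumption: fall back to a conservative full fence elsewhere. */
#define sketch_mb_after_unlock_lock_pair()	\
	atomic_thread_fence(memory_order_seq_cst)
#endif

/* Release one hypothetical lock word, acquire another, then upgrade the pair. */
static inline void sketch_handoff(atomic_bool *prev, atomic_bool *next)
{
	/* unlock: store with release semantics */
	atomic_store_explicit(prev, false, memory_order_release);

	/* lock: spin until the exchange observes the lock as free */
	while (atomic_exchange_explicit(next, true, memory_order_acquire))
		;

	/* must immediately follow the acquisition */
	sketch_mb_after_unlock_lock_pair();
}
```

Note that the macro orders only the unlock+lock pair; it makes no claim about accesses preceding the unlock-side critical section on its own, which is the distinction the longer name tries to capture.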

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .