On Thu, Jul 06, 2017 at 02:12:24PM +0000, David Laight wrote:
> From: Paul E. McKenney
> > Sent: 06 July 2017 00:30
> > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > and it appears that all callers could do just as well with a lock/unlock
> > pair.  This series therefore removes spin_unlock_wait() and changes
> > its users to instead use a lock/unlock pair.  The commits are as follows,
> > in three groups:
> >
> > 1-7.	Change uses of spin_unlock_wait() and raw_spin_unlock_wait()
> > 	to instead use a spin_lock/spin_unlock pair.  These may be
> > 	applied in any order, but must be applied before any later
> > 	commits in this series.  The commit logs state why I believe
> > 	that these commits won't noticeably degrade performance.
>
> I can't help feeling that it might be better to have a spin_lock_sync()
> call that is equivalent to a spin_lock/spin_unlock pair.
> The default implementation being an inline function that does exactly that.
> This would let an architecture implement a more efficient version.

So that has the IRQ inversion issue found earlier in this patch set.
Not actually doing the acquire, though, breaks other guarantees; see
below.

> It might even be that this is the defined semantics of spin_unlock_wait().

As it stands, spin_unlock_wait() is somewhat ill-defined.  IIRC it grew
out of an optimization by Oleg and subsequently got used elsewhere.  And
it being the subtle bugger it is, there were bugs.

Part of the problem with that definition is fairness.  For fair locks,
spin_lock()+spin_unlock() partakes in the fairness protocol.  Many of
the things that would make spin_lock_sync() cheaper preclude it from
doing that (with the exception of ticket locks, which could actually do
this).

But I think we can argue we don't in fact want that; all we really need
is to ensure the completion of the _current_ critical section.  But
then you've violated the claimed equivalence.  As is, this is all a bit
up in the air -- some of the ticket lock variants are fair or minimal,
while the qspinlock one is prone to starvation (although I think I can
in fact fix that for qspinlock).

> Note that it can only be useful to do a spin_lock/unlock pair if it is
> impossible for another code path to try to acquire the lock.
> (Or, at least, the code can't care if the lock is acquired just after.)
> So if it can be determined that the lock isn't held (a READ_ONCE()
> might be enough) the lock itself need not be acquired (with the
> associated slow bus cycles).

Now look at the ARM64/PPC implementations, which do explicit stores.
READ_ONCE() can only ever be sufficient on strongly ordered
architectures, but given that many performance-critical architectures
are in fact weakly ordered, you've opened up the exact can of worms we
want to get rid of the thing for.

So given:

	(1)

	spin_lock()
	spin_unlock()

this does not in fact provide memory ordering.  Any prior/subsequent
loads/stores can leak into the section and cross there.  What the thing
does do, however, is serialize against other critical sections, in
that:

	(2)	CPU0			CPU1

					spin_lock()
		spin_lock()		X = 5
					spin_unlock()
		spin_unlock()
		r = X;

we must have r == 5 (due to the address dependency on the lock: CPU1
does the store to X, then a store-release to the lock; CPU0 then does a
load-acquire on the lock, and that fully orders the subsequent load of
X against the prior store of X).
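To make (2) concrete, here is a minimal user-space sketch of that
litmus test; it substitutes pthread_spinlock_t for the kernel's
spinlock_t and threads for CPUs, purely so the example is
self-contained and runnable (build with -lpthread):

#include <pthread.h>
#include <stdio.h>

static pthread_spinlock_t l;
static int X;

/* Plays the role of CPU1: store to X inside the critical section. */
static void *cpu1(void *arg)
{
	(void)arg;
	pthread_spin_lock(&l);
	X = 5;
	pthread_spin_unlock(&l);	/* store-release of the lock word */
	return NULL;
}

/* Plays the role of CPU0: empty critical section, then read X. */
static void *cpu0(void *arg)
{
	int r;

	(void)arg;
	pthread_spin_lock(&l);		/* load-acquire of the lock word */
	pthread_spin_unlock(&l);
	r = X;				/* must be 5 if the acquire above
					 * observed cpu1's unlock */
	printf("r == %d\n", r);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	pthread_spin_init(&l, PTHREAD_PROCESS_PRIVATE);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	pthread_spin_destroy(&l);
	return 0;
}

Note the guarantee is conditional on the interleaving: r == 0 remains
possible when cpu0's critical section happens to be serialized first;
what (2) rules out is cpu0 acquiring the lock after cpu1's unlock and
still reading a stale X.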
The other way around is more difficult though:

	(3)	CPU0			CPU1

		X = 5
		spin_lock()		spin_lock()
		spin_unlock()
					r = X;
					spin_unlock()

Where the above will in fact observe r == 5, this will be very
difficult to achieve with anything that does not make CPU1 wait.  Which
was the entire premise of the original optimization by Oleg.

One of the later fixes we did to spin_unlock_wait() was to give it
ACQUIRE semantics to deal with (2), but even that got massively tricky;
see for example the ARM64/PPC implementations.

In short, I doubt there is any real optimization possible if you want
to retain exact lock+unlock semantics.

On the one hand I feel, like Oleg, that it would be a shame to lose the
optimization; OTOH this thing is really, really tricky to use and has
led to a number of bugs already.
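For completeness, the conversion the series applies has this shape; the
function and lock names below are made up for illustration and are not
taken from any actual patch in the series:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(obj_lock);

static void sync_against_current_owner(void)
{
	/*
	 * Before: spin_unlock_wait(&obj_lock);
	 *
	 * That only waited for the lock word to read as unlocked, and
	 * on weakly ordered architectures it needed extra barriers to
	 * provide even ACQUIRE semantics.
	 */

	/*
	 * After: a full lock/unlock pair.  This waits for the current
	 * critical section to complete, participates in the lock's
	 * fairness protocol, and provides the acquire/release ordering
	 * discussed above.
	 */
	spin_lock(&obj_lock);
	spin_unlock(&obj_lock);
}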