Re: [PATCH v3 3/5] MCS Lock: Barrier corrections

Waiman Long <waiman.long@xxxxxx> · Wed, 06 Nov 2013 23:29:54 -0500



    On 11/06/2013 08:39 PM, Linus Torvalds wrote:
    
      
      Sorry about the HTML crap, the internet connection is
        too slow for my normal email habits, so I'm using my phone. 
      I think the barriers are still totally wrong for the
        locking functions.
      Adding an smp_rmb after waiting for the lock is pure
        BS. Writes in the locked region could percolate out of the
        locked region.
      The thing is, you cannot do the memory ordering for
        locks in any sane generic way. Not using our current barrier
        system. On x86 (and many others) the smp_rmb will work fine,
        because writes are never moved earlier. But on other
        architectures you really need an acquire to get a lock
        efficiently. No separate barriers. An acquire needs to be on the
        instruction that does the lock.
      Same goes for unlock. On x86 any store is a fine
        unlock, but on other architectures you need a store with a
        release marker.
      So no amount of barriers will ever do this correctly.
        Sure, you can add full memory barriers and it will be "correct"
        but it will be unbearably slow, and add totally unnecessary
        serialization. So *correct* locking will require architecture
        support.
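
To make the point above concrete, here is a minimal user-space sketch of an MCS-style queue lock in which the ordering is attached to the lock and unlock operations themselves rather than added as separate barriers. It is not the code from the patch series. It assumes C11 <stdatomic.h> in place of the kernel's own primitives, and the names mcs_node, mcs_lock, mcs_lock_acquire and mcs_lock_release are hypothetical. The load that observes the hand-off carries acquire semantics, and the store that performs the hand-off carries release semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space types for illustration; not the kernel's structs. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;   /* successor in the wait queue */
    atomic_bool locked;                /* set to true when the lock is handed to us */
};

struct mcs_lock {
    _Atomic(struct mcs_node *) tail;   /* last waiter, or NULL if the lock is free */
};

static void mcs_lock_acquire(struct mcs_lock *lock, struct mcs_node *me)
{
    assert(lock != NULL && me != NULL);

    /* The node is still private here, so relaxed stores are enough. */
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, false, memory_order_relaxed);

    /*
     * Acquire covers the uncontended case (we pair with the previous
     * owner's release on tail); release publishes our node setup to
     * whoever queues behind us.
     */
    struct mcs_node *prev =
        atomic_exchange_explicit(&lock->tail, me, memory_order_acq_rel);
    if (prev == NULL)
        return;                        /* lock was free; we own it now */

    /* Link ourselves behind the previous waiter. */
    atomic_store_explicit(&prev->next, me, memory_order_release);

    /*
     * The acquire sits on the load that observes the hand-off, so neither
     * loads nor stores of the critical section can move above it.
     */
    while (!atomic_load_explicit(&me->locked, memory_order_acquire))
        ;                              /* spin on our own node only */
}

static void mcs_lock_release(struct mcs_lock *lock, struct mcs_node *me)
{
    struct mcs_node *next =
        atomic_load_explicit(&me->next, memory_order_acquire);

    if (next == NULL) {
        /* No visible successor: try to mark the lock free. */
        struct mcs_node *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                &lock->tail, &expected, NULL,
                memory_order_release, memory_order_relaxed))
            return;

        /* A successor swapped tail but has not linked itself yet; wait. */
        while ((next = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }

    /* The release sits on the store that hands the lock to the successor. */
    atomic_store_explicit(&next->locked, true, memory_order_release);
}
```

Each thread passes its own mcs_node (typically on its stack) to both calls. On x86 the acquire load and release store compile to plain loads and stores, while weakly ordered architectures can use their load-acquire and store-release instructions where they have them, which is the per-architecture difference described above.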
           

    Yes, we realized that we can't do it in a generic way without
    introducing unwanted overhead. So I sent out another patch that does
    it in an architecture-specific way, letting each architecture
    choose its own memory barriers. It was at the end of the v3 and v4
    patch series.
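
    As a rough illustration of the override idea only (this is not the
    patch itself): the sketch below assumes C11 <stdatomic.h>, and the
    macro names demo_wait_for_lock_acquire and demo_pass_lock_release
    are made up. Generic code supplies a default for each hook, and an
    architecture header included earlier could define its own version.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical hook names. An architecture header included before this
 * point could define either macro to supply its own primitive; otherwise
 * the generic fallbacks below are used.
 */
#ifndef demo_wait_for_lock_acquire
/* Generic fallback: spin until *flag is nonzero, with acquire ordering. */
#define demo_wait_for_lock_acquire(flag)                              \
    do {                                                              \
        while (!atomic_load_explicit((flag), memory_order_acquire))   \
            ;                                                         \
    } while (0)
#endif

#ifndef demo_pass_lock_release
/* Generic fallback: hand the lock over with a release store. */
#define demo_pass_lock_release(flag) \
    atomic_store_explicit((flag), 1, memory_order_release)
#endif

/* Single-threaded smoke test of the two hooks; returns the final flag value. */
static int demo_handoff(void)
{
    atomic_int flag;

    atomic_init(&flag, 0);
    demo_pass_lock_release(&flag);      /* "unlock" side sets the flag */
    demo_wait_for_lock_acquire(&flag);  /* "lock" side sees it and returns */
    assert(atomic_load(&flag) == 1);
    return atomic_load(&flag);
}
```

    The generic code never names a specific barrier; it only calls the
    two hooks, so an architecture with cheaper ordering guarantees does
    not pay for barriers it does not need.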

    
    -Longman