Re: Do we need to correct barriering in circular-buffers.rst?

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Thu, 19 Sep 2019 08:59:23 -0700

On Thu, Sep 19, 2019 at 6:59 AM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> But I don't agree with this.  You're missing half the barriers.  There should
> be *four* barriers.  The document mandates only 3 barriers, and uses
> READ_ONCE() where the fourth should be, i.e.:
>
>    thread #1            thread #2
>
>                         smp_load_acquire(head)
>                         ... read data from queue ..
>                         smp_store_release(tail)
>
>    READ_ONCE(tail)
>    ... add data to queue ..
>    smp_store_release(head)

The document is right, but you shouldn't do this.

The reason that READ_ONCE() is possible - instead of a
smp_load_acquire() - is that there's now an address dependency chain
from the READ_ONCE to the subsequent writes of the data.

And while there isn't any barrier, a data or control dependency to a
_write_ does end up ordering things (even on alpha - it's only the
read->read dependencies that might be unordered on alpha).
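
For illustration only, here is a minimal userspace sketch of the barrier
pairing in the diagram above, written with C11 atomics rather than the
kernel primitives. The names (struct ring, ring_put, ring_get, RING_SIZE)
are hypothetical and not taken from circular-buffers.rst. One deliberate
difference: ISO C does not promise that a dependency from a relaxed load
orders a later store, so the sketch uses an acquire load where the kernel
document uses READ_ONCE(tail). It assumes exactly one producer thread and
one consumer thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u	/* hypothetical capacity; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

struct ring {
	_Atomic unsigned int head;	/* written only by the producer */
	_Atomic unsigned int tail;	/* written only by the consumer */
	int slot[RING_SIZE];
};

/* Producer side: thread #1 in the diagram. */
static bool ring_put(struct ring *r, int v)
{
	/* Only the producer writes head, so a relaxed load is enough. */
	unsigned int head = atomic_load_explicit(&r->head,
						 memory_order_relaxed);
	/*
	 * The kernel document uses READ_ONCE(tail) here and relies on the
	 * dependency to the write below. Plain C11 gives no such guarantee,
	 * so this sketch pays for an acquire load instead.
	 */
	unsigned int tail = atomic_load_explicit(&r->tail,
						 memory_order_acquire);

	if (head - tail >= RING_SIZE)
		return false;			/* queue is full */

	r->slot[head & (RING_SIZE - 1)] = v;	/* add data to queue */

	/* Pairs with the acquire load of head in ring_get(). */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}

/* Consumer side: thread #2 in the diagram. */
static bool ring_get(struct ring *r, int *out)
{
	/* Only the consumer writes tail, so a relaxed load is enough. */
	unsigned int tail = atomic_load_explicit(&r->tail,
						 memory_order_relaxed);
	/* Pairs with the release store of head in ring_put(). */
	unsigned int head = atomic_load_explicit(&r->head,
						 memory_order_acquire);

	if (head == tail)
		return false;			/* queue is empty */

	*out = r->slot[tail & (RING_SIZE - 1)];	/* read data from queue */

	/* Publish the freed slot; pairs with the tail load in ring_put(). */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}
```

A zero-initialized struct ring is an empty queue. The indices are
free-running and only masked when used as slot offsets, which is why the
capacity has to be a power of two.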

But again, don't do this.

Also, you ignored the part where I told you to not do this because we
already have locking.

I'm not going to discuss this further. Locking works. Spinlocks are
cheap. Lockless algorithms that need atomics aren't even cheaper than
spinlocks: they can in fact scale *worse*, because they don't have the
nice queuing optimization that our spinlocks have.

Lockless algorithms are great if they can avoid the contention on the
lock and instead only work on distributed data and avoid contention
entirely.

But in this case the lock would be right next to the data anyway, so
even that case doesn't hold.
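
As a rough userspace sketch of the locked alternative, the queue below
keeps its lock in the same structure as the indices and the data, and
every access happens under that lock, so no extra barriers are needed.
All names (struct locked_queue, lq_init, lq_put, lq_get, LQ_SIZE) are
hypothetical. The lock is a plain test-and-set spinlock built on C11
atomic_flag, standing in for a kernel spinlock; it does not have the
queuing behaviour of the kernel's spinlocks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define LQ_SIZE 8u	/* hypothetical capacity; must be a power of two */
static_assert((LQ_SIZE & (LQ_SIZE - 1)) == 0,
	      "LQ_SIZE must be a power of two");

struct locked_queue {
	atomic_flag lock;	/* sits right next to the data it protects */
	unsigned int head;	/* next slot to fill; protected by lock */
	unsigned int tail;	/* next slot to drain; protected by lock */
	int slot[LQ_SIZE];
};

static void lq_init(struct locked_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = 0;
	q->tail = 0;
}

static void lq_lock(struct locked_queue *q)
{
	/* Spin until the flag was previously clear; acquire on success. */
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;
}

static void lq_unlock(struct locked_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static bool lq_put(struct locked_queue *q, int v)
{
	bool ok = false;

	lq_lock(q);
	if (q->head - q->tail < LQ_SIZE) {	/* room left? */
		q->slot[q->head & (LQ_SIZE - 1)] = v;
		q->head++;
		ok = true;
	}
	lq_unlock(q);
	return ok;
}

static bool lq_get(struct locked_queue *q, int *out)
{
	bool ok = false;

	lq_lock(q);
	if (q->head != q->tail) {		/* anything queued? */
		*out = q->slot[q->tail & (LQ_SIZE - 1)];
		q->tail++;
		ok = true;
	}
	lq_unlock(q);
	return ok;
}
```

Because head, tail and the slots are only touched with the lock held,
this version also works with any number of producers and consumers.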

               Linus