On 19/04/21 10:26, Peter Zijlstra wrote:
On Mon, Apr 19, 2021 at 09:53:06AM +0200, Paolo Bonzini wrote:
On 19/04/21 09:32, Peter Zijlstra wrote:
On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
On 16/04/21 09:09, Peter Zijlstra wrote:
Well, the obvious example would be seqlocks. C11 can't do them
Sure it can. C11 requires annotating with (the equivalent of) READ_ONCE all
reads of seqlock-protected fields, but the memory model supports seqlocks
just fine.
How does that help?
IIRC there's two problems, one on each side of the lock. On the write side
we have:
	seq++;
	smp_wmb();
	X = r;
	Y = r;
	smp_wmb();
	seq++;
Which C11 simply cannot do right because it doesn't have wmb.
It has atomic_thread_fence(memory_order_release), and
atomic_thread_fence(memory_order_acquire) on the read side.
https://godbolt.org/z/85xoPxeE5
void writer(void)
{
	atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
This needs to be memory_order_release. The only change in the resulting
assembly is that "dmb ishld" becomes "dmb ish", which is not as good as
the "dmb ishst" you get from smp_wmb() but not buggy either.
The read side can use "dmb ishld" so it gets the same code as Linux.
LWN needs a "C11 memory model for kernel folks" article. In the
meantime there is
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0124r4.html
which is the opposite (Linux kernel memory model for C11 folks).
Paolo
	X = 1;
	Y = 2;
	atomic_store_explicit(&seq, seq+1, memory_order_release);
}
gives:
writer:
	adrp	x1, .LANCHOR0
	add	x0, x1, :lo12:.LANCHOR0
	ldr	w2, [x1, #:lo12:.LANCHOR0]
	add	w2, w2, 1
	str	w2, [x0]
	dmb	ishld
	ldr	w1, [x1, #:lo12:.LANCHOR0]
	mov	w3, 1
	mov	w2, 2
	stp	w3, w2, [x0, 4]
	add	w1, w1, w3
	stlr	w1, [x0]
	ret
Which, afaict, is completely buggered. What it seems to be doing is
turning the seq load into a load-acquire, but what we really need is to
make sure the seq store (increment) is ordered before the other stores.
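
For reference, here is a minimal sketch of how both sides might look in C11 once the writer's fence is memory_order_release, as suggested in the reply above. It assumes a single writer (no writer-side lock is shown) and that every seqlock-protected field is an atomic accessed with relaxed ordering, which is the C11 counterpart of READ_ONCE/WRITE_ONCE. The names seq, X and Y follow the snippet in this thread; the reader() function and both signatures are illustrative assumptions, not code from the thread or from the kernel.

```c
#include <stdatomic.h>

/* Illustrative seqlock state; names follow the example in the thread. */
static atomic_uint seq;
static atomic_int X, Y;

/* Write side. Assumes exactly one writer at a time. */
void writer(int x, int y)
{
	unsigned s = atomic_load_explicit(&seq, memory_order_relaxed);

	/* Odd value: an update is in progress. */
	atomic_store_explicit(&seq, s + 1, memory_order_relaxed);
	/* Release fence, standing in for the first smp_wmb(): it pairs with
	 * the reader's acquire fence, so a reader that sees the new X or Y
	 * will also see seq >= s + 1 on its second load and retry. */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&X, x, memory_order_relaxed);
	atomic_store_explicit(&Y, y, memory_order_relaxed);

	/* Store-release, standing in for the second smp_wmb(): the data
	 * stores are ordered before the final increment. */
	atomic_store_explicit(&seq, s + 2, memory_order_release);
}

/* Read side: retry until a stable, even sequence number is observed. */
void reader(int *x, int *y)
{
	unsigned s0, s1;

	do {
		/* Load-acquire: the data loads cannot move before it. */
		s0 = atomic_load_explicit(&seq, memory_order_acquire);

		*x = atomic_load_explicit(&X, memory_order_relaxed);
		*y = atomic_load_explicit(&Y, memory_order_relaxed);

		/* Acquire fence: the data loads are ordered before the
		 * second load of seq. */
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&seq, memory_order_relaxed);
	} while ((s0 & 1) || s0 != s1);
}
```

With this shape, calling writer(1, 2) and then reader(&a, &b) from a single thread leaves a == 1, b == 2 and seq == 2. As noted in the reply above, on arm64 the release fence is expected to compile to "dmb ish" rather than the "dmb ishst" that smp_wmb() gives, so this is correct but somewhat heavier than the kernel version.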