On Tue, 16 Feb 2021 at 20:55, Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
>
> On Tue, 2021-02-16 at 14:19 +0000, Brendan Jackman wrote:
> > As pointed out by Ilya and explained in the new comment, there's a
> > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> > the value from memory into r0, while x86 only does so when r0 and the
> > value in memory are different. The same issue affects s390.
> >
> > At first this might sound like pure semantics, but it makes a real
> > difference when the comparison is 32-bit, since the load will
> > zero-extend r0/rax.
> >
> > The fix is to explicitly zero-extend rax after doing such a
> > CMPXCHG. Since this problem affects multiple archs, this is done in
> > the verifier by patching in a BPF_ZEXT_REG instruction after every
> > 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> > can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> >
> > Reported-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> > Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx>
> > ---
> >
> > Difference from v1[1]: Now solved centrally in the verifier instead of
> > specifically for the x86 JIT. Thanks to Ilya and Daniel for the
> > suggestions!
> >
> > [1]
> > https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@xxxxxxxxxxxxx/T/#t
> >
> >  kernel/bpf/verifier.c                         | 36 +++++++++++++++++++
> >  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 +++++++++++++
> >  .../selftests/bpf/verifier/atomic_or.c        | 26 ++++++++++++++
> >  3 files changed, 87 insertions(+)
>
> I tried this with my s390 atomics patch, and it's working, thanks!
>
> I was thinking whether this could go through the existing zext_dst
> flag infrastructure, but it probably won't play too nicely with the
> x86_64 JIT, which doesn't override bpf_jit_needs_zext().

Ah right, I actually didn't understand what
opt_subreg_zext_lo32_rnd_hi32 was doing until now, so I didn't consider
this. But yeah, I think cmpxchg is properly special here because the
zext is sometimes (e.g. on x86_64) needed even on architectures that
don't _generally_ need explicit zext.

I think I'll update some comments to reflect these learnings, thanks.

> Acked-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> Tested-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
>
> [...]
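
For anyone following along, here is a rough userspace sketch of the
discrepancy discussed above. It is only a model, not kernel or JIT code:
registers are plain uint64_t values, nothing here is atomic, and the
helper names (bpf_cmpxchg32, x86_like_cmpxchg32, zext32, selfcheck) are
made up for illustration. It assumes the semantics described in the
commit message: BPF always loads the old value into r0, while x86 only
writes rax when the comparison fails.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the BPF semantics for a 32-bit cmpxchg (hypothetical helper):
 * r0 always receives the old memory value, zero-extended to 64 bits. */
static uint64_t bpf_cmpxchg32(uint32_t *mem, uint64_t r0, uint32_t src)
{
	uint32_t old = *mem;

	if (old == (uint32_t)r0)
		*mem = src;
	return (uint64_t)old;
}

/* Model of the x86 behaviour (hypothetical helper): rax is only written,
 * and therefore only zero-extended, when the comparison fails. On success
 * the upper 32 bits of rax are left as they were. */
static uint64_t x86_like_cmpxchg32(uint32_t *mem, uint64_t rax, uint32_t src)
{
	uint32_t old = *mem;

	if (old == (uint32_t)rax) {
		*mem = src;
		return rax;		/* upper half untouched */
	}
	return (uint64_t)old;		/* 32-bit load zero-extends */
}

/* Stand-in for the explicit zero-extension patched in after the cmpxchg. */
static uint64_t zext32(uint64_t reg)
{
	return (uint64_t)(uint32_t)reg;
}

/* Walk through the success case, where the two models disagree. */
static void selfcheck(void)
{
	uint32_t mem_bpf = 0x11223344, mem_x86 = 0x11223344;
	uint64_t r0 = UINT64_C(0xdeadbeef11223344);	/* garbage in upper half */

	uint64_t bpf = bpf_cmpxchg32(&mem_bpf, r0, 0x55667788);
	uint64_t x86 = x86_like_cmpxchg32(&mem_x86, r0, 0x55667788);

	assert(mem_bpf == 0x55667788 && mem_x86 == 0x55667788);
	assert(bpf == UINT64_C(0x11223344));
	assert(x86 == UINT64_C(0xdeadbeef11223344));	/* the discrepancy */
	assert(zext32(x86) == bpf);			/* fixed by the zext */
}
```

In the failure case both models already agree, since the load into the
32-bit register zero-extends on its own; the explicit zext only matters
when the exchange succeeds.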