On Wed, Mar 03, 2021 at 11:04:02AM +0000, Brendan Jackman wrote:
> As pointed out by Ilya and explained in the new comment, there's a
> discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> the value from memory into r0, while x86 only does so when r0 and the
> value in memory are different. The same issue affects s390.
>
> At first this might sound like pure semantics, but it makes a real
> difference when the comparison is 32-bit, since the load will
> zero-extend r0/rax.
>
> The fix is to explicitly zero-extend rax after doing such a
> CMPXCHG. Since this problem affects multiple archs, this is done in
> the verifier by patching in a BPF_ZEXT_REG instruction after every
> 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> can do a look-ahead with insn_is_zext to skip the unnecessary mov.
>
> Reported-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx>
> ---
>
> Note this still goes on top of Ilya's patch:
>
> https://lore.kernel.org/bpf/20210301154019.129110-1-iii@xxxxxxxxxxxxx/T/#u
>
> Differences v5->v6[1]:
>  - Moved is_cmpxchg_insn and ensured it can be safely re-used. Also renamed it
>    and removed 'inline' to match the style of the is_*_function helpers.
>  - Fixed up comments in verifier test (thanks for the careful review, Martin!)

Acked-by: Martin KaFai Lau <kafai@xxxxxx>
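
For anyone reading along, the discrepancy described in the quoted commit
message can be modelled in plain userspace C. This is only a rough sketch
under stated assumptions, not the verifier or JIT code: the helper names
bpf_cmpxchg32, x86_cmpxchg32 and zext32 are hypothetical, the operations
are not atomic here, and registers are modelled as uint64_t values.

```c
#include <stdint.h>

/*
 * Model of a 32-bit BPF cmpxchg as described in the commit message:
 * r0 is always loaded from memory, and the 32-bit load zero-extends it.
 * Hypothetical helper, not kernel code; not atomic.
 */
static uint64_t bpf_cmpxchg32(uint32_t *mem, uint64_t r0, uint32_t new_val)
{
	uint32_t old = *mem;

	if (old == (uint32_t)r0)
		*mem = new_val;
	return (uint64_t)old;	/* upper 32 bits are always cleared */
}

/*
 * Model of a 32-bit x86 cmpxchg: rax is only written when the
 * comparison fails, so on success its upper 32 bits are left as-is.
 * Hypothetical helper, not kernel code; not atomic.
 */
static uint64_t x86_cmpxchg32(uint32_t *mem, uint64_t rax, uint32_t new_val)
{
	uint32_t old = *mem;

	if (old == (uint32_t)rax) {
		*mem = new_val;
		return rax;	/* register untouched on success */
	}
	return (uint64_t)old;	/* 32-bit load zero-extends */
}

/* Stand-in for the explicit zero-extension patched in after the cmpxchg. */
static uint64_t zext32(uint64_t reg)
{
	return (uint32_t)reg;
}
```

With memory holding 1 and r0 holding 0xdeadbeef00000001, the BPF model
returns 0x1, while the x86 model returns the register unchanged with its
upper half intact. Applying zext32() to the x86 result brings the two
back into agreement, which is the effect the patched-in zero-extension
is meant to have.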