On Wed, Feb 17, 2021 at 7:30 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> wrote:
>
> On Wed, 2021-02-17 at 09:28 +0000, Brendan Jackman wrote:
> > As pointed out by Ilya and explained in the new comment, there's a
> > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
> > the value from memory into r0, while x86 only does so when r0 and the
> > value in memory are different. The same issue affects s390.
> >
> > At first this might sound like pure semantics, but it makes a real
> > difference when the comparison is 32-bit, since the load will
> > zero-extend r0/rax.
> >
> > The fix is to explicitly zero-extend rax after doing such a
> > CMPXCHG. Since this problem affects multiple archs, this is done in
> > the verifier by patching in a BPF_ZEXT_REG instruction after every
> > 32-bit cmpxchg. Any archs that don't need such manual zero-extension
> > can do a look-ahead with insn_is_zext to skip the unnecessary mov.
> >
> > Reported-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> > Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx>
> > ---
> >
> > Differences v2->v3[1]:
> >  - Moved patching into fixup_bpf_calls (patch incoming to rename this
> >    function)
> >  - Added extra commentary on bpf_jit_needs_zext
> >  - Added check to avoid adding a pointless zext(r0) if there's
> >    already one there.
> >
> > Difference v1->v2[1]: Now solved centrally in the verifier instead of
> > specifically for the x86 JIT. Thanks to Ilya and Daniel for the
> > suggestions!
> >
> > [1] v2:
> > https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@xxxxxxxxxxxxx/T/#t
> >     v1:
> > https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@xxxxxxxxxxxxx/T/#t
> >
> >  kernel/bpf/core.c                             |  4 +++
> >  kernel/bpf/verifier.c                         | 26 +++++++++++++++++++
> >  .../selftests/bpf/verifier/atomic_cmpxchg.c   | 25 ++++++++++++++++++
> >  .../selftests/bpf/verifier/atomic_or.c        | 26 +++++++++++++++++++
> >  4 files changed, 81 insertions(+)
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 16ba43352a5f..a0d19be13558 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -11662,6 +11662,32 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
> >  			continue;
> >  		}
> >
> > +		/* BPF_CMPXCHG always loads a value into R0, therefore always
> > +		 * zero-extends. However some archs' equivalent instruction only
> > +		 * does this load when the comparison is successful. So here we
> > +		 * add a BPF_ZEXT_REG after every 32-bit CMPXCHG, so that such
> > +		 * archs' JITs don't need to deal with the issue. Archs that
> > +		 * don't face this issue may use insn_is_zext to detect and skip
> > +		 * the added instruction.
> > +		 */
> > +		if (insn->code == (BPF_STX | BPF_W | BPF_ATOMIC) && insn->imm == BPF_CMPXCHG) {
> > +			struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) };
> > +
> > +			if (!memcmp(&insn[1], &zext_patch[1], sizeof(struct bpf_insn)))
> > +				/* Probably done by opt_subreg_zext_lo32_rnd_hi32. */
> > +				continue;
> > +
>
> Isn't opt_subreg_zext_lo32_rnd_hi32() called after fixup_bpf_calls()?

Indeed, this check should not be needed.

>
> [...]
>
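
For anyone following the thread without the JIT in their head, the discrepancy from the commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions: the model_* helpers below are hypothetical names made up for illustration, they are not kernel code and not the actual JIT output. They only mimic the behaviour described above, where BPF always loads the old value into R0 while an x86-like cmpxchg writes the accumulator only when the values differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code. Registers are plain
 * 64-bit integers and "memory" is a single 32-bit word.
 */

/* BPF semantics: R0 always receives the old memory value, zero-extended. */
static uint64_t model_bpf_cmpxchg32(uint64_t r0, uint32_t *mem, uint32_t src)
{
	uint32_t old = *mem;

	if ((uint32_t)r0 == old)
		*mem = src;
	return (uint64_t)old;
}

/*
 * x86-like semantics: the accumulator is written (and thereby
 * zero-extended) only when the compare fails. On a match it is left
 * untouched, so stale upper 32 bits survive.
 */
static uint64_t model_x86_cmpxchg32(uint64_t rax, uint32_t *mem, uint32_t src)
{
	uint32_t old = *mem;

	if ((uint32_t)rax == old) {
		*mem = src;
		return rax;
	}
	return (uint64_t)old;
}

/* Effect of the patched-in zero-extension of R0: clear the upper half. */
static uint64_t model_zext32(uint64_t reg)
{
	return (uint64_t)(uint32_t)reg;
}
```

With r0 = 0xdeadbeef00000001 and the memory word equal to 1, model_bpf_cmpxchg32() returns 1, whereas model_x86_cmpxchg32() returns the register unchanged, upper bits included. Passing that result through model_zext32() brings the two back in line, which is all the extra instruction is meant to achieve.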