On Tue, Feb 16, 2021 at 3:19 PM Brendan Jackman <jackmanb@xxxxxxxxxx> wrote: > > As pointed out by Ilya and explained in the new comment, there's a > discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads > the value from memory into r0, while x86 only does so when r0 and the > value in memory are different. The same issue affects s390. > > At first this might sound like pure semantics, but it makes a real > difference when the comparison is 32-bit, since the load will > zero-extend r0/rax. > > The fix is to explicitly zero-extend rax after doing such a > CMPXCHG. Since this problem affects multiple archs, this is done in > the verifier by patching in a BPF_ZEXT_REG instruction after every > 32-bit cmpxchg. Any archs that don't need such manual zero-extension > can do a look-ahead with insn_is_zext to skip the unnecessary mov. > > Reported-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg") > Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx> Acked-by: KP Singh <kpsingh@xxxxxxxxxx> > --- > > Difference from v1[1]: Now solved centrally in the verifier instead of > specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions! > > [1] https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@xxxxxxxxxxxxx/T/#t > > kernel/bpf/verifier.c | 36 +++++++++++++++++++ > .../selftests/bpf/verifier/atomic_cmpxchg.c | 25 +++++++++++++ > .../selftests/bpf/verifier/atomic_or.c | 26 ++++++++++++++ > 3 files changed, 87 insertions(+) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 16ba43352a5f..7f4a83d62acc 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -11889,6 +11889,39 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) > return 0; > } > > +/* BPF_CMPXCHG always loads a value into R0, therefore always zero-extends. > + * However some archs' equivalent instruction only does this load when the > + * comparison is successful. So here we add a BPF_ZEXT_REG after every 32-bit > + * CMPXCHG, so that such archs' JITs don't need to deal with the issue. Archs > + * that don't face this issue may use insn_is_zext to detect and skip the added > + * instruction. > + */ > +static int add_zext_after_cmpxchg(struct bpf_verifier_env *env) > +{ > + struct bpf_insn zext_patch[2] = { [1] = BPF_ZEXT_REG(BPF_REG_0) }; I was initially confused as to why do we have 2 instructions here for the patch. > + struct bpf_insn *insn = env->prog->insnsi; > + int insn_cnt = env->prog->len; > + struct bpf_prog *new_prog; > + int delta = 0; /* Number of instructions added */ > + int i; > + > + for (i = 0; i < insn_cnt; i++, insn++) { > + if (insn->code != (BPF_STX | BPF_W | BPF_ATOMIC) || insn->imm != BPF_CMPXCHG) > + continue; > + > + zext_patch[0] = *insn; But the patch also needs to have the original instruction, so it makes sense. > + new_prog = bpf_patch_insn_data(env, i + delta, zext_patch, 2); > + if (!new_prog) > + return -ENOMEM; > + > + delta++; > + env->prog = new_prog; > + insn = new_prog->insnsi + i + delta; > + } > + > + return 0; > +} > + > static void free_states(struct bpf_verifier_env *env) > { > struct bpf_verifier_state_list *sl, *sln; > @@ -12655,6 +12688,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, > if (ret == 0) > ret = fixup_call_args(env); > > + if (ret == 0) > + ret = add_zext_after_cmpxchg(env); > + > env->verification_time = ktime_get_ns() - start_time; > print_verification_stats(env); > > diff --git a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c > index 2efd8bcf57a1..6e52dfc64415 100644 > --- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c > +++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c > @@ -94,3 +94,28 @@ > .result = REJECT, > .errstr = "invalid read from stack", > [...] > > base-commit: 45159b27637b0fef6d5ddb86fc7c46b13c77960f > -- > 2.30.0.478.g8a0d178c01-goog >