Hi, I'm implementing BPF_CMPXCHG for the s390x JIT and noticed that the doc, the interpreter and the X64 JIT do not agree on what the behavior should be when the values match. If the operand size is BPF_W, this matters, because, depending on the answer, the upper bits of R0 are either zeroed out out or are left intact. I made the experiment based on the following modification to the "atomic compare-and-exchange smoketest - 32bit" test on top of commit ee5cc0363ea0: --- a/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c +++ b/tools/testing/selftests/bpf/verifier/atomic_cmpxchg.c @@ -57,8 +57,8 @@ BPF_MOV32_IMM(BPF_REG_1, 4), BPF_MOV32_IMM(BPF_REG_0, 3), BPF_ATOMIC_OP(BPF_W, BPF_CMPXCHG, BPF_REG_10, BPF_REG_1, -4), - /* if (old != 3) exit(4); */ - BPF_JMP32_IMM(BPF_JEQ, BPF_REG_0, 3, 2), + /* if ((u64)old != 3) exit(4); */ + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 3, 2), BPF_MOV32_IMM(BPF_REG_0, 4), BPF_EXIT_INSN(), /* if (val != 4) exit(5); */ and got the following results: 1) Documentation: Upper bits of R0 are zeroed out - but it looks as if there is a typo and either a period or the word "otherwise" is missing? > If they match it is replaced with ``src_reg``, The value that was > there before is loaded back to ``R0``. 2) Interpreter + KVM: Upper bits of R0 are zeroed out (C semantics) 3) X64 JIT + KVM: Upper bits of R0 are untouched (cmpxchg semantics) => 0xffffffffc0146bc7: lock cmpxchg %edi,-0x4(%rbp) 0xffffffffc0146bcc: cmp $0x3,%rax (gdb) p/x $rax 0x6bd5720c00000003 (gdb) x/d $rbp-4 0xffffc90001263d5c: 3 0xffffffffc0146bc7: lock cmpxchg %edi,-0x4(%rbp) => 0xffffffffc0146bcc: cmp $0x3,%rax (gdb) p/x $rax 0x6bd5720c00000003 4) X64 JIT + TCG: Upper bits of R0 are zeroed out (qemu bug?) => 0xffffffffc01441fc: lock cmpxchg %edi,-0x4(%rbp) 0xffffffffc0144201: cmp $0x3,%rax (gdb) p/x $rax 0x81776ea600000003 (gdb) x/d $rbp-4 0xffffc90001117d5c: 3 0xffffffffc01441fc: lock cmpxchg %edi,-0x4(%rbp) => 0xffffffffc0144201: cmp $0x3,%rax (gdb) p/x $rax $3 = 0x3 So which option is the right one? In my opinion, it would be safer to follow what the interpreter does and zero out the upper bits. Best regards, Ilya