Patch "riscv, bpf: Make BPF_CMPXCHG fully ordered" has been added to the 6.6-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Tue, 22 Oct 2024 13:49:53 -0400

This is a note to let you know that I've just added the patch titled

    riscv, bpf: Make BPF_CMPXCHG fully ordered

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     riscv-bpf-make-bpf_cmpxchg-fully-ordered.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 8a732a04f6626a6c3aba840a08ac7f8ca03889ed
Author: Andrea Parri <parri.andrea@xxxxxxxxx>
Date:   Thu Oct 17 17:36:28 2024 +0300

    riscv, bpf: Make BPF_CMPXCHG fully ordered
    
    [ Upstream commit e59db0623f6955986d1be0880b351a1f56e7fd6d ]
    
    According to the prototype formal BPF memory consistency model
    discussed e.g. in [1] and following the ordering properties of
    the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
    with the BPF_CMPXCHG modifier is fully ordered.  However, the
    current RISC-V JIT lowerings fail to meet such memory ordering
    property.  This is illustrated by the following litmus test:
    
    BPF BPF__MP+success_cmpxchg+fence
    {
     0:r1=x; 0:r3=y; 0:r5=1;
     1:r2=y; 1:r4=f; 1:r7=x;
    }
     P0                               | P1                                         ;
     *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
     r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
                                      | r6 = *(u64 *)(r7 + 0)                      ;
    exists (1:r1=1 /\ 1:r6=0)
    
    whose "exists" clause is not satisfiable according to the BPF
    memory model.  Using the current RISC-V JIT lowerings, the test
    can be mapped to the following RISC-V litmus test:
    
    RISCV RISCV__MP+success_cmpxchg+fence
    {
     0:x1=x; 0:x3=y; 0:x5=1;
     1:x2=y; 1:x4=f; 1:x7=x;
    }
     P0                 | P1                          ;
     sd x5, 0(x1)       | ld x1, 0(x2)                ;
     L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
     lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
     bne x2, x4, L01    |                             ;
     sc.d x6, x5, 0(x3) |                             ;
     bne x6, x4, L00    |                             ;
     fence rw, rw       |                             ;
     L01:               |                             ;
    exists (1:x1=1 /\ 1:x6=0)
    
    where the two stores in P0 can be reordered.  Update the RISC-V
    JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
    RELEASE ("rl") annotation in order to meet the expected memory
    ordering guarantees.  The resulting RISC-V JIT lowerings of
    BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().
    
    Other lowerings were fixed via 20a759df3bba ("riscv, bpf: make
    some atomic operations fully ordered").
    
    Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
    Signed-off-by: Andrea Parri <parri.andrea@xxxxxxxxx>
    Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
    Reviewed-by: Puranjay Mohan <puranjay@xxxxxxxxxx>
    Acked-by: Björn Töpel <bjorn@xxxxxxxxxx>
    Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
    Link: https://lore.kernel.org/bpf/20241017143628.2673894-1-parri.andrea@xxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 2f041b5cea970..26eeb39736319 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -555,8 +555,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
 		     rv_lr_w(r0, 0, rd, 0, 0), ctx);
 		jmp_offset = ninsns_rvoff(8);
 		emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
-		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
-		     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
+		     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
 		jmp_offset = ninsns_rvoff(-6);
 		emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
 		emit(rv_fence(0x3, 0x3), ctx);