On Wed, Feb 12, 2025 at 2:14 PM Peilin Ye <yepeilin@xxxxxxxxxx> wrote: > > On Mon, Feb 10, 2025 at 10:51:11PM +0000, Peilin Ye wrote: > > > > #define BPF_LOAD_ACQ 0x10 > > > > #define BPF_STORE_REL 0x20 so that was broken then, since BPF_SUB 0x10 ? And original thing was also completely broken for BPF_ATOMIC_LOAD | BPF_RELAXED == 0x10 == BPF_SUB ? so much for "lets define relaxed, acquire, release, acq_rel for completeness". :( > > > > > > why not 1 and 2 ? > > > > I just realized that we can't do 1 and 2 because BPF_ADD | BPF_FETCH > > also equals 1. > > > > > All other bits are reserved and the verifier will make sure they're zero > > > > IOW, we can't tell if imm<4-7> is reserved or BPF_ADD (0x00). What > > would you suggest? Maybe: > > > > #define BPF_ATOMIC_LD_ST 0x10 > > > > #define BPF_LOAD_ACQ 0x1 > > #define BPF_STORE_REL 0x2 This is also broken, since BPF_ATOMIC_LD_ST | BPF_LOAD_ACQ == 0x11 == BPF_SUB | BPF_FETCH. BPF_SUB | BPF_FETCH is invalid at the moment, but such aliasing is bad. > > > > ? > > Or, how about reusing 0xb in imm<4-7>: > > #define BPF_ATOMIC_LD_ST 0xb0 > > #define BPF_LOAD_ACQ 0x1 > #define BPF_STORE_REL 0x2 > > 0xb is BPF_MOV in BPFArithOp<>, and we'll never need it for BPF_ATOMIC. > Instead of moving values between registers, we now "move" values from/to > the memory - if I can think of it that way. and BPF_ATOMIC_LD_ST | BPF_LOAD_ACQ would == BPF_MOV | BPF_FETCH ? Not pretty and confusing. BPF_FETCH modifier means that "do whatever opcode says to do, like add in memory, but also return the value into insn->src_reg" Which doesn't fit this BPF_ATOMIC_LD_ST | BPF_LOAD_ACQ semantics which loads into _dst_reg_. How about: #define BPF_LOAD_ACQ 2 #define BPF_STORE_REL 3 and only use them with BPF_MOV like imm = BPF_MOV | BPF_LOAD_ACQ - is actual load acquire imm = BPF_MOV | BPF_STORE_REL - release Thought 2 stands on its own, it's also equal to BPF_ADD | BPF_LOAD_ACQ which is kinda ugly, so I don't like to use 2 alone. > Or, do we want to start to use the remaining bits of the imm field (i.e. > imm<8-31>) ? Maybe. Sort-of. Since #define BPF_CMPXCHG (0xf0 | BPF_FETCH) another option would be: #define BPF_LOAD_ACQ 0x100 #define BPF_STORE_REL 0x110 essentially extending op type to: BPF_ATOMIC_TYPE(imm) ((imm) & 0x1f0) All options are not great. I feel we need to step back. Is there an architecture that has sign extending load acquire ? Looks like arm doesn't, and I couldn't find any arch that does. Then maybe we should reconsider BPF_LDX/STX and use BPF_MODE to distinguish from normal ldx/stx #define BPF_ACQ_REL 0xe0 BPF_LDX | BPF_ACQ_REL | BPF_W BPF_STX | BPF_ACQ_REL | BPF_W ?