On Mon, Nov 30, 2020 at 7:51 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> On 11/30/20 9:22 AM, Yonghong Song wrote:
> >
> > On 11/28/20 5:40 PM, Alexei Starovoitov wrote:
> >> On Fri, Nov 27, 2020 at 09:53:05PM -0800, Yonghong Song wrote:
> >>>
> >>> On 11/27/20 9:57 AM, Brendan Jackman wrote:
> >>>> Status of the patches
> >>>> =====================
> >>>>
> >>>> Thanks for the reviews! Differences from v1->v2 [1]:
> >>>>
> >>>> * Fixed mistakes in the netronome driver
> >>>>
> >>>> * Addd sub, add, or, xor operations
> >>>>
> >>>> * The above led to some refactors to keep things readable. (Maybe I
> >>>>   should have just waited until I'd implemented these before starting
> >>>>   the review...)
> >>>>
> >>>> * Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, which
> >>>>   include the BPF_FETCH flag
> >>>>
> >>>> * Added a bit of documentation. Suggestions welcome for more places
> >>>>   to dump this info...
> >>>>
> >>>> The prog_test that's added depends on Clang/LLVM features added by
> >>>> Yonghong in
> >>>> https://reviews.llvm.org/D72184
> >>>>
> >>>> This only includes a JIT implementation for x86_64 - I don't plan to
> >>>> implement JIT support myself for other architectures.
> >>>>
> >>>> Operations
> >>>> ==========
> >>>>
> >>>> This patchset adds atomic operations to the eBPF instruction set. The
> >>>> use-case that motivated this work was a trivial and efficient way to
> >>>> generate globally-unique cookies in BPF progs, but I think it's
> >>>> obvious that these features are pretty widely applicable. The
> >>>> instructions that are added here can be summarised with this list of
> >>>> kernel operations:
> >>>>
> >>>> * atomic[64]_[fetch_]add
> >>>> * atomic[64]_[fetch_]sub
> >>>> * atomic[64]_[fetch_]and
> >>>> * atomic[64]_[fetch_]or
> >>>
> >>> * atomic[64]_[fetch_]xor
> >>>
> >>>> * atomic[64]_xchg
> >>>> * atomic[64]_cmpxchg
> >>>
> >>> Thanks. Overall looks good to me but I did not check carefully
> >>> on jit part as I am not an expert in x64 assembly...
> >>>
> >>> This patch also introduced atomic[64]_{sub,and,or,xor}, similar to
> >>> xadd. I am not sure whether it is necessary. For one thing,
> >>> users can just use atomic[64]_fetch_{sub,and,or,xor} to ignore
> >>> return value and they will achieve the same result, right?
> >>> From llvm side, there is no ready-to-use gcc builtin matching
> >>> atomic[64]_{sub,and,or,xor} which does not have return values.
> >>> If we go this route, we will need to invent additional bpf
> >>> specific builtins.
> >>
> >> I think bpf specific builtins are overkill.
> >> As you said the users can use atomic_fetch_xor() and ignore
> >> return value. I think llvm backend should be smart enough to use
> >> BPF_ATOMIC | BPF_XOR insn without BPF_FETCH bit in such case.
> >> But if it's too cumbersome to do at the moment we skip this
> >> optimization for now.
> >
> > We can initially all have BPF_FETCH bit as at that point we do not
> > have def-use chain. Later on we can add a
> > machine ssa IR phase and check whether the result of, say
> > atomic_fetch_or(), is used or not. If not, we can change the
> > instruction to atomic_or.
>
> Just implemented what we discussed above in llvm:
>    https://reviews.llvm.org/D72184
> main change:
>    1. atomic_fetch_sub (and later atomic_sub) is gone. llvm will
>       transparently transforms it to negation followed by
>       atomic_fetch_add or atomic_add (xadd). Kernel can remove
>       atomic_fetch_sub/atomic_sub insns.
>    2. added new instructions for atomic_{and, or, xor}.
>    3. for gcc builtin e.g., __sync_fetch_and_or(), if return
>       value is used, atomic_fetch_or will be generated. Otherwise,
>       atomic_or will be generated.

Great, this means that all existing valid uses of __sync_fetch_and_add() will generate BPF_XADD instructions and will work on old kernels, right? If that's the case, do we still need cpu=v4?
The new instructions are *only* going to be generated if the user uses previously unsupported __sync_fetch_xxx() intrinsics. So, in effect, the user consciously opts into using new BPF instructions. cpu=v4 seems like an unnecessary tautology then?
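
To make sure we are talking about the same thing, here is a rough sketch of the three cases as I understand them from your description of D72184. The function names are made up for illustration, and this is plain C using the GCC/Clang __sync builtins, so it only demonstrates the source-level semantics on a host compiler. The instruction named in each comment is my reading of the thread, not something I have verified against the BPF backend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: result of the builtin is discarded.
 * My understanding of the thread: this is how existing programs use
 * __sync_fetch_and_add(), and it would keep mapping to BPF_XADD. */
void count_event(uint64_t *counter)
{
	__sync_fetch_and_add(counter, 1);
}

/* Hypothetical example: result of the builtin is discarded.
 * Per item 3 above, this would become atomic_or (no BPF_FETCH). */
void set_flag(uint64_t *flags, uint64_t bit)
{
	(void)__sync_fetch_and_or(flags, bit);
}

/* Hypothetical example: the old value is consumed by the caller.
 * Per item 3 above, this would become atomic_fetch_or.
 * Returns `bit` if it was already set before the call, 0 otherwise. */
uint64_t test_and_set_flag(uint64_t *flags, uint64_t bit)
{
	return __sync_fetch_and_or(flags, bit) & bit;
}
```

If that reading is right, only set_flag() and test_and_set_flag() would need a new kernel, and the author of such code has already chosen to use an intrinsic that was previously unsupported.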