Re: [PATCH 6/7] bpf: Add instructions for atomic_cmpxchg and friends

On Tue, Nov 24, 2020 at 2:55 AM Brendan Jackman <jackmanb@xxxxxxxxxx> wrote:
>
> On Mon, Nov 23, 2020 at 10:40:00PM -0800, Alexei Starovoitov wrote:
> > On Mon, Nov 23, 2020 at 05:32:01PM +0000, Brendan Jackman wrote:
> > > These are the operations that implement atomic exchange and
> > > compare-exchange.
> > >
> > > They are peculiarly named because of the presence of the separate
> > > FETCH field that tells you whether the instruction writes the value
> > > back to the src register. Neither operation is supported without
> > > BPF_FETCH:
> > >
> > > - BPF_CMPSET without BPF_FETCH (i.e. an atomic compare-and-set
> > >   without knowing whether the write was successful) isn't implemented
> > >   by the kernel, x86, or ARM. It would be a burden on the JIT and it's
> > >   hard to imagine a use for this operation, so it's not supported.
> > >
> > > - BPF_SET without BPF_FETCH would be bpf_set, which has pretty
> > >   limited use: all it really lets you do is atomically set 64-bit
> > >   values on 32-bit CPUs. It doesn't imply any barriers.
> >
> > ...
> >
> > > -                   if (insn->imm & BPF_FETCH) {
> > > +                   switch (insn->imm) {
> > > +                   case BPF_SET | BPF_FETCH:
> > > +                           /* src_reg = atomic_chg(*(u32/u64*)(dst_reg + off), src_reg); */
> > > +                           EMIT1(0x87);
> > > +                           break;
> > > +                   case BPF_CMPSET | BPF_FETCH:
> > > +                           /* r0 = atomic_cmpxchg(*(u32/u64*)(dst_reg + off), r0, src_reg); */
> > > +                           EMIT2(0x0F, 0xB1);
> > > +                           break;
> > ...
> > >  /* atomic op type fields (stored in immediate) */
> > > +#define BPF_SET            0xe0    /* atomic write */
> > > +#define BPF_CMPSET 0xf0    /* atomic compare-and-write */
> > > +
> > >  #define BPF_FETCH  0x01    /* fetch previous value into src reg */
> >
> > I think SET in the name looks odd.
> > I understand that you picked this name so that SET|FETCH together would form
> > more meaningful combination of words, but we're not planning to support SET
> > alone. There is no such instruction in a cpu. If we ever do test_and_set it
> > would be something different.
>
> Yeah this makes sense...
>
> > How about the following instead:
> > +#define BPF_XCHG     0xe1    /* atomic exchange */
> > +#define BPF_CMPXCHG  0xf1    /* atomic compare exchange */
> > In other words get that fetch bit right away into the encoding.
> > Then the switch statement above could be:
> > +                     switch (insn->imm) {
> > +                     case BPF_XCHG:
> > +                             /* src_reg = atomic_chg(*(u32/u64*)(dst_reg + off), src_reg); */
> > +                             EMIT1(0x87);
> > ...
> > +                     case BPF_ADD | BPF_FETCH:
> > ...
> > +                     case BPF_ADD:
>
> ... Although I'm a little wary of this because it makes it very messy to
> do something like switch(BPF_OP(insn->imm)) since we'd have no name for
> BPF_OP(0xe1). That might be fine - I haven't needed such a construction
> so far (although I have used BPF_OP(insn->imm)) so maybe we wouldn't
> ever need it.
>
> What do you think? Maybe we add the `#define BPF_XCHG 0xe1` and then if we
> later need to do switch(BPF_OP(insn->imm)) we could bring back
> `#define BPF_SET 0xe` as needed?
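
A minimal sketch of the encoding discussed above, for illustration only. BPF_OP() and BPF_ADD match the existing definitions in include/uapi/linux/bpf_common.h; the BPF_XCHG and BPF_CMPXCHG values are the ones proposed in this thread; the enum and classify_atomic() are hypothetical names, not kernel code. It shows that folding the fetch bit into the opcode keeps the switch on insn->imm simple, while BPF_OP(BPF_XCHG) yields 0xe0, a value that has no name of its own.

```c
#include <stdint.h>

/* These two match include/uapi/linux/bpf_common.h. */
#define BPF_OP(code)  ((code) & 0xf0)
#define BPF_ADD       0x00

/* Values proposed in this thread. */
#define BPF_FETCH     0x01
#define BPF_XCHG      (0xe0 | BPF_FETCH)  /* 0xe1, atomic exchange */
#define BPF_CMPXCHG   (0xf0 | BPF_FETCH)  /* 0xf1, atomic compare exchange */

/* Hypothetical labels, used only by this sketch. */
enum atomic_kind {
	ATOMIC_INVALID,
	ATOMIC_ADD,
	ATOMIC_FETCH_ADD,
	ATOMIC_XCHG,
	ATOMIC_CMPXCHG,
};

/* Hypothetical helper: classify the immediate of an atomic instruction. */
static enum atomic_kind classify_atomic(int32_t imm)
{
	switch (imm) {
	case BPF_ADD:
		return ATOMIC_ADD;
	case BPF_ADD | BPF_FETCH:
		return ATOMIC_FETCH_ADD;
	case BPF_XCHG:
		return ATOMIC_XCHG;
	case BPF_CMPXCHG:
		return ATOMIC_CMPXCHG;
	default:
		/* Includes 0xe0 and 0xf0: no exchange without the fetch bit. */
		return ATOMIC_INVALID;
	}
}
```

With these definitions, classify_atomic(0xe1) is an exchange and classify_atomic(0xe0) is rejected, which is the behaviour both sides of the thread seem to want; only a switch on BPF_OP(insn->imm) would need a separate name for 0xe0.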

I don't think we'll add C atomic_set any time soon, since the kernel's
atomic_set is, according to the kernel memory model, the same as WRITE_ONCE.
That is different from C atomic_set, which llvm implements as atomic_xchg
and which includes a barrier. Kernel barriers are explicit.
I think eventually we may add various barriers to the bpf isa, but not
atomic_set, just like we don't add insns for READ_ONCE and WRITE_ONCE.
Normal loads/stores should be honored by JITs: READ_ONCE/WRITE_ONCE ==
*(volatile uX *) in C will be compiled by llvm into normal bpf ld/st, and
JITs have to preserve them as-is.
Otherwise the bpf memory model (when it's eventually defined) would have to
diverge too much from the kernel's. I think it needs to preserve
READ_ONCE/WRITE_ONCE just like the kernel does. Hence no special C-like
atomic_set, and when JITs process ld/st they have to emit them as a single
insn when possible.
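
To make the distinction concrete, here is a rough userspace sketch. The READ_ONCE/WRITE_ONCE macros below are simplified stand-ins for the kernel ones (they rely on the GCC/Clang __typeof__ extension), and the variable and function names are made up for this example. The plain accessors compile to ordinary loads and stores with no barrier, while the C11 atomic_exchange() returns the old value and defaults to sequentially consistent ordering.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel macros: a single volatile access. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical storage used only by this sketch. */
static uint64_t plain_word;
static _Atomic uint64_t atomic_word;

/* Ordinary store through a volatile lvalue: no barrier is implied. */
static void store_once(uint64_t v)
{
	WRITE_ONCE(plain_word, v);
}

/* Ordinary load through a volatile lvalue: no barrier is implied. */
static uint64_t load_once(void)
{
	return READ_ONCE(plain_word);
}

/*
 * C11 atomic exchange: writes v and returns the previous value,
 * with memory_order_seq_cst ordering by default.
 */
static uint64_t exchange(uint64_t v)
{
	return atomic_exchange(&atomic_word, v);
}
```

The first two functions are what a bpf program would express with a normal ld/st; the third corresponds to the exchange instruction discussed in this patch.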
