Re: [PATCH bpf-next 6/7] libbpf: BPF Static Keys support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 09, 2023 at 10:32:42PM -0800, Yonghong Song wrote:
> 
> On 12/9/23 9:18 AM, Alexei Starovoitov wrote:
> > On Fri, Dec 8, 2023 at 9:05 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
> > > 
> > > On 12/8/23 8:25 PM, Alexei Starovoitov wrote:
> > > > On Fri, Dec 8, 2023 at 8:15 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
> > > > > On 12/8/23 8:05 PM, Alexei Starovoitov wrote:
> > > > > > On Fri, Dec 8, 2023 at 2:04 PM Andrii Nakryiko
> > > > > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > > > > > > I feel like embedding some sort of ID inside the instruction is very..
> > > > > > > unusual, shall we say?
> > > > > > yeah. no magic numbers inside insns pls.
> > > > > > 
> > > > > > I don't like JA_CFG name, since I read CFG as control flow graph,
> > > > > > while you probably meant CFG as configurable.
> > > > > > How about BPF_JA_OR_NOP ?
> > > > > > Then in combination with BPF_JMP or BPF_JMP32 modifier
> > > > > > the insn->off|imm will be used.
> > > > > > 1st bit in src_reg can indicate the default action: nop or jmp.
> > > > > > In asm it may look like asm("goto_or_nop +5")
> > > > > How does the C source code looks like in order to generate
> > > > > BPF_JA_OR_NOP insn? Any source examples?
> > > > It will be in inline asm only. The address of that insn will
> > > > be taken either via && or via asm (".long %l[label]").
> > > >   From llvm pov both should go through the same relo creation logic. I hope :)
> > > A hack in llvm below with an example, could you check whether the C
> > > syntax and object dump result
> > > is what you want to see?
> > Thank you for the ultra quick llvm diff!
> > 
> > > diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
> > > b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
> > > index 90697c6645be..38b1cbc31f9a 100644
> > > --- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
> > > +++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
> > > @@ -231,6 +231,7 @@ public:
> > >            .Case("call", true)
> > >            .Case("goto", true)
> > >            .Case("gotol", true)
> > > +        .Case("goto_or_nop", true)
> > >            .Case("*", true)
> > >            .Case("exit", true)
> > >            .Case("lock", true)
> > > @@ -259,6 +260,7 @@ public:
> > >            .Case("bswap64", true)
> > >            .Case("goto", true)
> > >            .Case("gotol", true)
> > > +        .Case("goto_or_nop", true)
> > >            .Case("ll", true)
> > >            .Case("skb", true)
> > >            .Case("s", true)
> > > diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td
> > > b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > > index 5972c9d49c51..a953d10429bf 100644
> > > --- a/llvm/lib/Target/BPF/BPFInstrInfo.td
> > > +++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
> > > @@ -592,6 +592,19 @@ class BRANCH<BPFJumpOp Opc, string OpcodeStr,
> > > list<dag> Pattern>
> > >      let BPFClass = BPF_JMP;
> > >    }
> > > 
> > > +class BRANCH_OR_NOP<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
> > > +    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
> > > +                   (outs),
> > > +                   (ins brtarget:$BrDst),
> > > +                   !strconcat(OpcodeStr, " $BrDst"),
> > > +                   Pattern> {
> > > +  bits<16> BrDst;
> > > +
> > > +  let Inst{47-32} = BrDst;
> > > +  let Inst{31-0} = 1;
> > > +  let BPFClass = BPF_JMP;
> > > +}
> > > +
> > >    class BRANCH_LONG<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
> > >        : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
> > >                       (outs),
> > > @@ -632,6 +645,7 @@ class CALLX<string OpcodeStr>
> > >    let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {
> > >      def JMP : BRANCH<BPF_JA, "goto", [(br bb:$BrDst)]>;
> > >      def JMPL : BRANCH_LONG<BPF_JA, "gotol", []>;
> > > +  def JMP_OR_NOP : BRANCH_OR_NOP<BPF_JA, "goto_or_nop", []>;
> > I was thinking of burning the new 0xE opcode for it,
> > but you're right. It's a flavor of existing JA insn and it's indeed
> > better to just use src_reg=1 bit to indicate so.
> 
> Right, using src_reg to indicate a new flavor of JA insn sounds
> a good idea. My previously-used 'imm' field is a pure hack.
> 
> > 
> > We probably need to use the 2nd bit of src_reg to indicate its default state
> > (jmp or fallthrough).
> 
> Good point.
> 
> > 
> > >           asm volatile goto ("r0 = 0; \
> > >                               goto_or_nop %l[label]; \
> > >                               r2 = 2; \
> > >                               r3 = 3; \
> > Not sure how to represent the default state in assembly though.
> > "goto_or_nop" defaults to goto
> > "nop_or_goto" default to nop
> > ?
> > 
> > Do we need "gotol" for imm32 or will it be automatic?
> 
> It won't be automatic.
> 
> At the end of this email, I will show the new change
> to have gotol_or_nop and nop_or_gotol insn and an example

Thanks a lot Yonghong! May I ask you to send a full patch for LLVM
(with gotol) so that I can test it?

Overall, I think that JA + flags in SRC_REG is indeed better than a
new instruction, as a new code is not used.

This looks for me that two bits aren't enough, and the third is
required, as the second bit seems to be overloaded:

  * bit 1 indicates that this is a "JA_MAYBE"
  * bit 2 indicates a jump or nop (i.e., the current state)

However, we also need another bit which indicates what to do with the
instruction when we issue [an abstract] command

  flip_branch_on_or_off(branch, 0/1)

Without this information (and in the absense of external meta-data on
how to patch the branch) we can't determine what a given (BPF, not
jitted) program currently does. For example, if we issue

  flip_branch_on_or_off(branch, 0)

then we can't reflect this in the xlated program by setting the second
bit to jmp/off. Again, JITted program is fine, but it will be
desynchronized from xlated in term of logic (some instructions will be
mapped as NOP -> x86_JUMP, others as NOP -> x86_NOP).

In my original patch we kept this triplet as

  (offset to indicate a "special jump", JA+0/JA+OFF, Normal/Inverse)

> to show it in asm. But there is an issue here.
> In my example, the compiler (more specifically
> the InstCombine pass) moved some code after
> the 'label' to before the 'label'. Not exactly
> sure how to prevent this. Maybe current
> 'asm goto' already have a way to handle
> this. Will investigate this later.
> 
> 
> =========================
> 
> $ cat t.c
> int bar(void);
> int foo1()
> {
>         int a, b;
>         asm volatile goto ("r0 = 0; \
>                             gotol_or_nop %l[label]; \
>                             r2 = 2; \
>                             r3 = 3; \
>                            "::::label);
>         a = bar();
> label:
>         b = 20 * a;
>         return b;
> }
> int foo2()
> {
>         int a, b;
>         asm volatile goto ("r0 = 0; \
>                             nop_or_gotol %l[label]; \
>                             r2 = 2; \
>                             r3 = 3; \
>                            "::::label);
>         a = bar();
> label:
>         b = 20 * a;
>         return b;
> }
> $ clang --target=bpf -O2 -g -c t.c
> $ llvm-objdump -S t.o
> 
> t.o:    file format elf64-bpf
> 
> Disassembly of section .text:
> 
> 0000000000000000 <foo1>:
> ; {
>        0:       b7 00 00 00 00 00 00 00 r0 = 0x0
> ;       asm volatile goto ("r0 = 0; \
>        1:       b7 00 00 00 00 00 00 00 r0 = 0x0
>        2:       06 10 00 00 04 00 00 00 gotol_or_nop +0x4 <LBB0_2>
>        3:       b7 02 00 00 02 00 00 00 r2 = 0x2
>        4:       b7 03 00 00 03 00 00 00 r3 = 0x3
> ;       a = bar();
>        5:       85 10 00 00 ff ff ff ff call -0x1
> ;       b = 20 * a;
>        6:       27 00 00 00 14 00 00 00 r0 *= 0x14
> 
> 0000000000000038 <LBB0_2>:
> ;       return b;
>        7:       95 00 00 00 00 00 00 00 exit
> 
> 0000000000000040 <foo2>:
> ; {
>        8:       b7 00 00 00 00 00 00 00 r0 = 0x0
> ;       asm volatile goto ("r0 = 0; \
>        9:       b7 00 00 00 00 00 00 00 r0 = 0x0
>       10:       06 20 00 00 04 00 00 00 nop_or_gotol +0x4 <LBB1_2>
>       11:       b7 02 00 00 02 00 00 00 r2 = 0x2
>       12:       b7 03 00 00 03 00 00 00 r3 = 0x3
> ;       a = bar();
>       13:       85 10 00 00 ff ff ff ff call -0x1
> ;       b = 20 * a;
>       14:       27 00 00 00 14 00 00 00 r0 *= 0x14
> 
> 0000000000000078 <LBB1_2>:
> ;       return b;
>       15:       95 00 00 00 00 00 00 00 exit
> 




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux