Re: [PATCH RFC v4 net-next 01/26] net: filter: add "load 64-bit immediate" eBPF instruction

On Wed, Aug 13, 2014 at 9:08 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Wed, Aug 13, 2014 at 12:57 AM, Alexei Starovoitov <ast@xxxxxxxxxxxx> wrote:
>> add BPF_LD_IMM64 instruction to load 64-bit immediate value into register.
>> All previous instructions were 8-byte. This is first 16-byte instruction.
>> Two consecutive 'struct bpf_insn' blocks are interpreted as single instruction:
>> insn[0/1].code = BPF_LD | BPF_DW | BPF_IMM
>> insn[0/1].dst_reg = destination register
>> insn[0].imm = lower 32-bit
>> insn[1].imm = upper 32-bit
>
> This might be unnecessarily difficult for fancy static analysis tools
> to reason about.  Would it make sense to assign two different codes
> for this?  For example, insn[0].code = code_for_load_low,
> insns[1].code = code_for_load_high, along with a verifier check that
> they come in matched pairs and that code_for_load_high isn't a jump
> target?

See my reply to David on the same point. The short answer is that a
sequence of instructions (even just a pair like this one) is very hard
to detect in the verifier and the JITs.
As soon as we give the compiler two instructions instead of one,
it may optimize them in fancy ways. For example, two loads of
64-bit immediates that share the same upper 32 bits may come out as
4 instructions: load_high, load_low, load_low, mov.
In some cases a load may even come out as a single load_low.
The 64-bit immediate load has to stay a single instruction to be
easily verifiable and patchable.
One can argue that we could force the compiler to always emit load_low
and load_hi together, but that is exactly what I have: it's a single insn.
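
To make the layout concrete, here is a minimal user-space sketch in C of
the two-slot encoding as the patch description quoted above states it
(both slots carry the same code and dst_reg, and imm is split into a
lower and an upper half). The struct is a simplified mirror of
'struct bpf_insn', and the helper names emit_ld_imm64, read_ld_imm64 and
insn_slots are hypothetical, made up for illustration; this is not the
kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode field values as found in the BPF uapi headers. */
#define BPF_LD  0x00
#define BPF_IMM 0x00
#define BPF_DW  0x18

#define LD_IMM64_CODE (BPF_LD | BPF_DW | BPF_IMM)

/* Simplified mirror of 'struct bpf_insn': one 8-byte slot. */
struct bpf_insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg:4;   /* destination register */
	uint8_t src_reg:4;   /* source register */
	int16_t off;         /* signed offset */
	int32_t imm;         /* signed 32-bit immediate */
};

/* Hypothetical helper: fill two consecutive slots that together form
 * one 16-byte "load 64-bit immediate" instruction. */
static void emit_ld_imm64(struct bpf_insn insn[2], uint8_t dst, uint64_t value)
{
	for (int i = 0; i < 2; i++) {
		insn[i].code = LD_IMM64_CODE;
		insn[i].dst_reg = dst;
		insn[i].src_reg = 0;
		insn[i].off = 0;
	}
	insn[0].imm = (int32_t)(uint32_t)value;         /* lower 32 bits */
	insn[1].imm = (int32_t)(uint32_t)(value >> 32); /* upper 32 bits */
}

/* Hypothetical helper: rebuild the 64-bit value from the two slots. */
static uint64_t read_ld_imm64(const struct bpf_insn insn[2])
{
	return (uint64_t)(uint32_t)insn[0].imm |
	       ((uint64_t)(uint32_t)insn[1].imm << 32);
}

/* Hypothetical helper: number of 8-byte slots the instruction starting
 * at 'insn' occupies. A decoder that advances by this amount never
 * interprets the second slot as an instruction of its own. */
static size_t insn_slots(const struct bpf_insn *insn)
{
	return insn->code == LD_IMM64_CODE ? 2 : 1;
}
```

Because a walker checks only the code of the first slot and then skips
both, the pair is always handled as one unit and there is no separate
load_low/load_high pattern to match.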

> (Something else that I find confusing about eBPF: the instruction
> mnemonics are very strange.  Have you considered giving them real
> names?  For example, load.imm.low instead of BPF_LD | BPF_DW | BPF_IMM
> is easier to read and pronounce.)

BPF_LD | BPF_DW | BPF_IMM is not really a name. It's a macro
for cases when instructions are generated from inside the kernel.
Instruction mnemonics are not defined yet.
llvm emits assembler code like:
bpf_prog2:
  ldw r1, 16(r1)
  std -8(r10), r1
  mov r1, 1
  std -16(r10), r1
  ld_64 r1, 1
  mov r2, r10
  addi r2, -8
  call 4
  jeqi r0, 0 goto .LBB1_2
  ldd r1, 0(r0)
  addi r1, 1
  std 0(r0), r1
.LBB1_3:
  mov r0, 0
  ret
...
I'm open to changing the assembler/disassembler mnemonics.