Re: [PATCH bpf-next 6/7] libbpf: BPF Static Keys support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Dec 10, 2023 at 2:30 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> How about a slightly different modification of the Anton's idea.
> Suppose that, as before, there is a special map type:
>
>     struct {
>         __uint(type, BPF_MAP_TYPE_ARRAY);
>         __type(key, __u32);
>         __type(value, __u32);
>         __uint(map_flags, BPF_F_STATIC_KEY);
>         __uint(max_entries, 1);
>     } skey1 SEC(".maps")

Instead of special map that the kernel has to know about
the same intent can be expressed with:
int skey1;
r0 = %[skey1] ll;
and then the kernel needs no extra map type while the user space
can collect all static_branches that use &skey1 by
iterating insn stream and comparing addresses.

> Which is used as below:
>
>     __attribute__((naked))
>     int foo(void) {
>       asm volatile (
>                     "r0 = %[skey1] ll;"
>                     "if r0 != r0 goto 1f;"
>                     "r1 = r10;"
>                     "r1 += -8;"
>                     "r2 = 1;"
>                     "call %[bpf_trace_printk];"
>             "1:"
>                     "exit;"
>                     :: __imm_addr(skey1),
>                        __imm(bpf_trace_printk)
>                     : __clobber_all
>       );
>     }
>
> Disassembly of section .text:
>
> 0000000000000000 <foo>:
>        0:   r0 = 0x0 ll
>         0000000000000000:  R_BPF_64_64  skey1  ;; <---- Map relocation as usual
>        2:   if r0 == r0 goto +0x4 <foo+0x38>   ;; <---- Note condition
>        3:   r1 = r10
>        4:   r1 += -0x8
>        5:   r2 = 0x1
>        6:   call 0x6
>        7:   exit
>
> And suppose that verifier is modified in the following ways:
> - treat instructions "if rX == rX" / "if rX != rX" (when rX points to
>   static key map) in a special way:
>   - when program is verified, the jump is considered non deterministic;
>   - when program is jitted, the jump is compiled as nop for "!=" and as
>     unconditional jump for "==";
> - build a table of static keys based on a specific map referenced in
>   condition, e.g. for the example above it can be inferred that insn 2
>   associates with map skey1 because "r0" points to "skey1";
> - jit "rX = <static key> ll;" as nop;
>
> On the plus side:
> - any kinds of jump tables are omitted from system call;
> - no new instruction is needed;
> - almost no modifications to libbpf are necessary (only a helper macro
>   to convince clang to keep "if rX == rX");

Reusing existing insn means that we're giving it new meaning
and that always comes with danger of breaking existing progs.
In this case if rX == rX isn't very meaningful and new semantics
shouldn't break anything, but it's a danger zone.

If we treat:
if r0 == r0
as JA
then we have to treat
if r1 == r1
as JA as well and it becomes ambiguous when prog_info needs
to return the insns back to user space.

If we go with rX == rX approach we should probably limit it
to one specific register. r0, r10, r11 can be considered
and they have their own pros and cons.

Additional:
r0 = %[skey1] ll
in front of JE/JNE is a waste. If we JIT it to useless native insn
we will be burning cpu for no reason. So we should probably
optimize it out. If we do so, then this inline insn becomes a nop and
it's effectively a relocation. The insn stream will carry this
rX = 64bit_const insn to indicate the scope of the next insn.
It's pretty much like Anton's idea of using extra bits in JA
to encode an integer key_id.
With ld_imm64 we will encode 64-bit key_id.
Another insn with more bits to burn that has no effect on execution.

It doesn't look clean to encode so much extra metadata into instructions
that JITs and the interpreter have to ignore.
If we go this route:
  r11 = 64bit_const
  if r11 == r11 goto
is a lesser evil.
Still, it's not as clean as JA with extra bits in src_reg.
We already optimize JA +0 into a nop. See opt_remove_nops().
So a flavor of JA insn looks the most natural fit for a selectable
JA +xx or JA +0.

And the special map really doesn't fit.
Whatever we do, let's keep text_poke-able insn logic separate
from bookkeeping of addresses of those insns.
I think a special prefixed section that is understood by libbpf
(like what I proposed with "name.static_branch") will do fine.
If it's not good enough we can add a "set" map type
that will be a generic set of values.
It can be a set of 8-byte addresses to keep locations of static_branches,
but let's keep it generic.
I think it's fine to add:
__uint(type, BPF_MAP_TYPE_SET)
and let libbpf populate it with addresses of insns,
or address of variables, or other values
when it prepares a program for loading.
But map_update_elem should never be doing text_poke on insns.
We added prog_array map type is the past, but that was done
during the early days. If we were designing bpf today we would have
gone a different route.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux