Re: [RFC PATCH bpf-next 2/4] bpf: introduce BPF dispatcher

Toke Høiland-Jørgensen <thoiland@xxxxxxxxxx> · Thu, 14 Nov 2019 13:31:15 +0100

Björn Töpel <bjorn.topel@xxxxxxxxx> writes:

> From: Björn Töpel <bjorn.topel@xxxxxxxxx>
>
> The BPF dispatcher builds on top of the BPF trampoline ideas;
> Introduce bpf_arch_text_poke() and (re-)use the BPF JIT generate
> code. The dispatcher builds a dispatch table for XDP programs, for
> retpoline avoidance. The table is a simple binary search model, so
> lookup is O(log n). Here, the dispatch table is limited to four
> entries (for laziness reason -- only 1B relative jumps :-P). If the
> dispatch table is full, it will fallback to the retpoline path.

So it's O(log n) with n == 4? Have you compared the performance of just
doing four linear compare-and-jumps? Seems to me it may not be that big
of a difference for such a small N?

> An example: A module/driver allocates a dispatcher. The dispatcher is
> shared for all netdevs. Each netdev allocate a slot in the dispatcher
> and a BPF program. The netdev then uses the dispatcher to call the
> correct program with a direct call (actually a tail-call).

Is it really accurate to call it a tail call? To me, that would imply
that it increments the tail call limit counter and all that? Isn't this
just a direct jump using the trampoline stuff?

-Toke