Re: [PATCH 1/1] selftests/bpf: add cls_redirect classifier

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Tue, 5 May 2020 18:30:02 -0700

On Tue, May 5, 2020 at 6:37 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 5/5/20 1:48 AM, Alexei Starovoitov wrote:
> > On Sat, May 02, 2020 at 01:48:51AM +0200, Daniel Borkmann wrote:
> >> On 4/27/20 11:45 AM, Lorenz Bauer wrote:
> >>> On Sun, 26 Apr 2020 at 18:33, Alexei Starovoitov
> >>> <alexei.starovoitov@xxxxxxxxx> wrote:
> >> [...]
> >>>>> +/* Linux packet pointers are either aligned to NET_IP_ALIGN (aka 2 bytes),
> >>>>> + * or not aligned if the arch supports efficient unaligned access.
> >>>>> + *
> >>>>> + * Since the verifier ensures that eBPF packet accesses follow these rules,
> >>>>> + * we can tell LLVM to emit code as if we always had a larger alignment.
> >>>>> + * It will yell at us if we end up on a platform where this is not valid.
> >>>>> + */
> >>>>> +typedef uint8_t *net_ptr __attribute__((align_value(8)));
> >>>>
> >>>> Wow. I didn't know about this attribute.
> >>>> I wonder whether it can help Daniel's memcpy hack.
> >>>
> >>> Yes, I think so.
> >>
> >> Just for some more context [0]. I think the problem is a bit more complex in
> >> general. Generally, _any_ kind of pointer to some data (except for the stack)
> >> is currently treated as byte-by-byte copy from __builtin_memcpy() and other
> >> similarly available __builtin_*() helpers on BPF backend since the backend
> >> cannot make any assumptions about the data's alignment and whether unaligned
> >> access from the underlying arch is ok & efficient (the latter the verifier
> >> does judge for us however). So it's definitely not just limited to xdp->data.
> >> There is also the issue that while access to any non-stack data can be
> >> unaligned, access to the stack however cannot. I've discussed a while back
> >> with Yonghong about potential solutions. One would be to add a small patch
> >> to the BPF backend to enable __builtin_*() helpers to allow for unaligned
> >> access which could then be opt-ed in e.g. via -mattr from llc for the case
> >> when we know that the compiled program only runs on archs with efficient
> >> unaligned access anyway. However, this still potentially breaks with the BPF
> >> stack for the case when objects are, for example, larger than size 8 but with
> >> a natural alignment smaller than 8 where __builtin_memcpy() would then decide
> >> to emit dw-typed load/stores. But for these cases could then be annotated via
> >> __aligned(8) on stack. So this is basically what we do right now as a generic
> >> workaround in Cilium [0], meaning, our own memcpy/memset with optimal number
> >> of instructions and __aligned(8) where needed; most of the time this __aligned(8)
> >> is not needed, so it's really just a few places, and we also have a cocci
> >> scripts to catch these during development if needed. Anyway, real thing would
> >> be to allow the BPF stack for unaligned access as well and then BPF backend
> >> could nicely solve this in a native way w/o any workarounds, but that is tbd.
> >>
> >> Thanks,
> >> Daniel
> >>
> >>    [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
> >
> > Daniel,
> > do you mind adding such memcpy to libbpf ?
>
> We could do that, yeah. Is there a way from BPF C code when compiling with clang to
> get to the actual underlying architecture (x86-64, arm64, ppc, etc) when compiling
> with `-target bpf` so that we can always fall back to __builtin_*() for those where
> verifier would bail out on unaligned access? Keep in mind the __bpf_memcpy() and
> __bpf_memzero() from [0] are fully compile time resolved and I did the implementation
> for sizes of 1,2,4,..., 96 where the latter is in two' increments, so no odd buffer
> sizes as we don't need them in our code. If someone does hit such an odd case, then
> I'm currently throwing a compilation error via __throw_build_bug(). Latter is a nice
> way to be able to guarantee that a switch/case or if condition is never hit during
> compilation time. It resolves to __builtin_trap() which is not implemented in the
> BPF backend and therefore yells to the developer when built into the code (this has
> a nice property which wouldn't work with BUILD_BUG_ON() for example). Anyway, what
> I'm saying is that either we'd need the full thing with all sizes or document that
> unsupported size would be hit when __builtin_trap() assertion is seen.

I think it would be fine to simply document it.
Most structures have at least one 'int' and don't have 'packed',
so they are multiple of 4 typically. Multiple of 2 limitation should be
acceptable for most cases.