RE: [Bpf] BPF ISA conformance groups

dthaler1968@xxxxxxxxxxxxxx · Tue, 12 Dec 2023 14:01:09 -0800

David Vernet <void@xxxxxxxxxxxxx> wrote: 
[...]
> > > > A given runtime platform would be compliant to some set of the
> > > > following conformance groups:
> > > >
> > > > 1. "basic": all instructions not covered by another group below.
> > > > 2. "atomic": all Atomic operations.  I think Christoph argued for
> > > > this one in the meeting.
> > > > 3. "divide": all division and modulo operations.  Alexei said in
> > > > the meeting that he'd heard demand for this one.
> > > > 4. "legacy": all legacy packet access instructions (deprecated).
> > > > 5. "map": 64-bit immediate instructions that deal with map fds or
> > > > map indices.
> > > > 6. "code": 64-bit immediate instruction that has a "code pointer" type.
> > > > 7. "func": program-local functions.
> > >
> > > I thought for a while about whether this should be part of the basic
> > > conformance group, and talked through it with Jakub Kicinski. I do
> > > think it makes sense to keep it separate like this. For e.g. devices
> > > with Harvard architectures, it could get quite non-trivial for the
> > > verifier to determine whether accesses to arguments stored in
> > > special register are safe. Definitely not impossible, and overall
> > > very useful to support this, but in order to ease vendor adoption
> > > it's probably best to keep this separate.
> > >
> > > > Things that I *think* don't need a separate conformance group (can
> > > > just be in "basic") include:
> > > > a. Call helper function by address or BTF ID.  A runtime that
> > > > doesn't support these simply won't expose any
> > > >     such helper functions to BPF programs.
> > > > b. Platform variable instructions (dst = var_addr(imm)).  A
> > > > runtime that doesn't support this simply won't
> > > >     expose any platform variables to BPF programs.
> > > >
> > > > Comments? (Let the bikeshedding begin...)
> > >
> > > This list seems logical to me,
> >
> > I think we should do just two categories: legacy and the rest, since
> > any scheme will be flawed and infinite bikeshedding will ensue.
> 
> If we do this, then aren't we forcing every vendor that adds BPF support to
> support every single instruction if they want to be compliant?

Right, indeed I think it could hinder BPF adoption if we required every
single instruction in any hardware offload card that wanted to add BPF support.

> > For example, let's take a look at #2 atomic...
> > Should it include or exclude atomic_add insn ? It was added at the
> > very beginning of BPF ISA and was used from day one.
> > Without it it's impossible to count stats. The typical network or
> > tracing use case needs to count events and one cannot do it without
> > atomic increment. Eventually per-cpu maps were added as an alternative.
> > I suspect any platform that supports #1 basic insn without atomic_add
> > will not be practically useful.
> > Should atomic_add be a part of "basic" then? But it's atomic.
> > Then what about atomic_fetch_add insn? It's pretty close semantically.
> > Part of atomic or part of basic?
> 
> I think it's reasonable to expect that if you require an atomic add, that you
> may also require the other atomic instructions as well and that it would be
> logical to group them together, yes. I believe that Netronome supports all of
> the atomic instructions, as one example. If you're providing a BPF runtime in
> an environment where atomic adds are required, I think it stands to reason
> that you should probably support the other atomics as well, no?

I agree.

> > Another example, #3 divide. bpf cpu=v1 ISA only has unsigned div/mod.
> > Eventually we added a signed version. Integer division is one of the
> > slowest operations in a HW. Different cpus have different flavors of
> > them 64/32 64/64 32/32, etc. All with different quirks.
> > cpu=v1 had modulo insn because in tracing one often needs to do it to
> > select a slot in a table, but in networking there is rarely a need.
> > So bpf offload into netronome HW doesn't support it (iirc).
> 
> Correct, my understanding is that BPF offload in netronome supports neither
> division nor modulo.

In my opinion, this is a valid technical reason to put them into a separate
conformance group, to allow hardware offload cards to support BPF without
requiring division/modulo which they might not have space or other budget for.

> > Should div/mod signed/unsigned be a part of basic? or separate?
> > Only 32 or 64 bit?
> >
> > Hence my point: legacy and the rest (as of cpu=v4) are the only two
> > categories we should have in _this_ version of the standard.
> > Rest assured we will add new insn in the coming months.
> > I suggest we figure out conformance groups for future insns at that time.
> > That would be the time to argue and actually extract value out of
> discussion.
> > Retroactive bike shedding is a bike shedding and nothing else.
> 
> I wouldn't personally categorize this as retroactive _bikeshedding_.

_ bikeshedding_ typically refers to spending a disproportionate amount of
time on trivial matters.  The discussion here isn't about a trivial matter,
in my view, it's about encouraging adoption of BPF in standardized ways
that can be reasoned about and strike a reasonable balance between
platform complexity/cost and user-perceivable complexity.

> What we're trying to do is retroactive _grouping_, and I think what you're
> really arguing for is that grouping in general isn't necessary.
> So I think we should maybe take a step back and talk about what value it
> brings at a higher level to determine if the complexity / ambiguity of grouping
> is worth the benefit.
> 
> From my perspective, the reason that we want conformance groups is purely
> for compliance and cross compatibility. If someone has a BPF program that
> does some class of operations, then conformance groups inform them about
> whether their prog will be able to run on some vendor implementation of
> BPF.  For example, if you're doing network packet filtering and doing some
> atomics, hashing, etc on a Netronome NIC, you'd like for it to be able to run
> on other NICs that implement offload as well. If a NIC isn't compliant with the
> atomics group, it won't be able to support your prog.
> 
> If we don't have conformance groups, I don't see how we can provide that
> guarantee. I think there's essentially a 0% chance that vendors will implement
> every instruction; nor should they have to. So if we just do "legacy" and
> "other", the grouping won't really tell a vendor or BPF developer anything
> other than what instructions are completely useless and should be avoided.

+1 to all of above.

> If we want to get rid of conformance groups that's fine and I do think there's
> an argument for it, but I think we need to discuss this in terms of compliance
> and not the grouping aspect.
> 
> FWIW, my perspective is that we should be aiming to enable compliance.
> I don't see any reason why a BPF prog that's offloaded to a NIC to do packet
> filtering shouldn't be able to e.g. run on multiple devices.
> That certainly won't be the case for every type of BPF program, but classifying
> groups of instructions does seem prudent.

I agree with David's assessment above.

Dave