Re: [PATCH bpf-next 0/9] xdp: Support multiple programs on a single interface through chain calls

Jesper Dangaard Brouer <brouer@xxxxxxxxxx> · Thu, 3 Oct 2019 12:09:23 +0200

On Thu, 03 Oct 2019 09:48:22 +0200
Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:

> John Fastabend <john.fastabend@xxxxxxxxx> writes:
> 
> > Toke Høiland-Jørgensen wrote:  
> >> John Fastabend <john.fastabend@xxxxxxxxx> writes:
> >>   
> >> > Toke Høiland-Jørgensen wrote:  
> >> >> Alan Maguire <alan.maguire@xxxxxxxxxx> writes:
> >> >>   
> >> >> > On Wed, 2 Oct 2019, Toke Høiland-Jørgensen wrote:
> >> >> >  
> >> >> >> This series adds support for executing multiple XDP programs on a single
> >> >> >> interface in sequence, through the use of chain calls, as discussed at the Linux
> >> >> >> Plumbers Conference last month:
> >> >> >> 
> >> >> >> https://linuxplumbersconf.org/event/4/contributions/460/
> >> >> >> 
> >> >> >> # HIGH-LEVEL IDEA
> >> >> >> 
> >> >> >> The basic idea is to express the chain call sequence through a special map type,
> >> >> >> which contains a mapping from a (program, return code) tuple to another program
> >> >> >> to run next in the sequence. Userspace can populate this map to express
> >> >> >> arbitrary call sequences, and update the sequence by updating or replacing the
> >> >> >> map.
> >> >> >> 
> >> >> >> The actual execution of the program sequence is done in bpf_prog_run_xdp(),
> >> >> >> which will look up the chain sequence map, and if found, will loop through calls
> >> >> >> to BPF_PROG_RUN, looking up the next XDP program in the sequence based on the
> >> >> >> previous program ID and return code.
> >> >> >> 
> >> >> >> An XDP chain call map can be installed on an interface by means of a new netlink
> >> >> >> attribute containing an fd pointing to a chain call map. This can be supplied
> >> >> >> along with the XDP prog fd, so that a chain map is always installed together
> >> >> >> with an XDP program.
> >> >> >>   
> >> >> >
> >> >> > This is great stuff Toke!  
> >> >> 
> >> >> Thanks! :)
> >> >>   
> >> >> > One thing that wasn't immediately clear to me - and this may be just
> >> >> > me - is the relationship between program behaviour for the XDP_DROP
> >> >> > case and chain call execution. My initial thought was that a program
> >> >> > in the chain XDP_DROP'ping the packet would terminate the call chain,
> >> >> > but on looking at patch #4 it seems that the only way the call chain
> >> >> > execution is terminated is if
> >> >> >
> >> >> > - XDP_ABORTED is returned from a program in the call chain; or  
> >> >> 
> >> >> Yes. Not actually sure about this one...
> >> >>   
> >> >> > - the map entry for the next program (determined by the return value
> >> >> >   of the current program) is empty; or  
> >> >> 
> >> >> This will be the common exit condition, I expect
> >> >>   
> >> >> > - we run out of entries in the map  
> >> >> 
> >> >> You mean if we run the iteration counter to zero, right?
> >> >>   
> >> >> > The return value of the last-executed program in the chain seems to be
> >> >> > what determines packet processing behaviour after executing the chain
> >> >> > (_DROP, _TX, _PASS, etc). So there's no way to both XDP_PASS and
> >> >> > XDP_TX a packet from the same chain, right? Just want to make sure
> >> >> > I've got the semantics correct. Thanks!  
> >> >> 
> >> >> Yeah, you've got all this right. The chain call mechanism itself doesn't
> >> >> change any of the underlying fundamentals of XDP. I.e., each packet gets
> >> >> exactly one verdict.
> >> >> 
> >> >> For chaining actual XDP programs that do different things to the packet,
> >> >> I expect that the most common use case will be to only run the next
> >> >> program if the previous one returns XDP_PASS. That will make the most
> >> >> semantic sense I think.
> >> >> 
> >> >> But there are also use cases where one would want to match on the other
> >> >> return codes; such as packet capture, for instance, where one might
> >> >> install a capture program that would carry forward the previous return
> >> >> code, but do something to the packet (throw it out to userspace) first.
> >> >> 
> >> >> For the latter use case, the question is if we need to expose the
> >> >> previous return code to the program when it runs. You can do things
> >> >> without it (by just using a different program per return code), but it
> >> >> may simplify things if we just expose the return code. However, since
> >> >> this will also change the semantics for running programs, I decided to
> >> >> leave that off for now.
> >> >> 
> >> >> -Toke  
> >> >
> >> > In other cases where programs (e.g. cgroups) are run in an array the
> >> > return codes are 'AND'ed together so that we get
> >> >
> >> >    result1 & result2 & ... & resultN  

But the XDP return codes are not bit values, so an AND operation
doesn't make sense to me.
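
As a quick illustration (a userspace sketch, not kernel code): the enum
below is a local copy of the verdict values from
include/uapi/linux/bpf.h, and and_verdicts() is a hypothetical combiner
that mimics the cgroup-style AND.

```c
/* Local copy of the XDP verdict values from include/uapi/linux/bpf.h. */
enum xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

/* Hypothetical combiner: what a cgroup-style AND of two verdicts yields. */
static int and_verdicts(int a, int b)
{
	return a & b;
}
```

With these values, and_verdicts(XDP_DROP, XDP_PASS) is 1 & 2 = 0, which
is XDP_ABORTED, and and_verdicts(XDP_PASS, XDP_TX) is 2 & 3 = 2, which
is XDP_PASS. Neither result is a meaningful combination of the inputs.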

> >> 
> >> How would that work with multiple programs, though? PASS -> DROP seems
> >> obvious, but what if the first program returns TX? Also, programs may
> >> want to be able to actually override return codes (e.g., say you want to
> >> turn DROPs into REDIRECTs, to get all your dropped packets mirrored to
> >> your IDS or something).  
> >
> > In general I think either you hard code a precedence that will have to
> > be overly conservative because if one program (your firewall) tells
> > XDP to drop the packet and some other program redirects it, passes,
> > etc. that seems incorrect to me. Or you get creative with the
> > precedence rules and they become complex and difficult to manage,
> > where a drop will drop a packet unless a previous/preceding program
> > redirects it, etc. I think any hard coded precedence you come up with
> > will make someone happy and some other user annoyed, defeating the
> > programmability of BPF.  
> 
> Yeah, exactly. That's basically why I punted on that completely.
> Besides, technically you can get this by just installing different
> programs in each slot if you really need it.

I would really like to avoid hard coding precedence.  I know it is
"challenging" that we want to allow overruling any XDP return code, but
I think it makes sense and it is the most flexible solution.
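
To make the override case concrete, here is a rough userspace model of
the lookup loop described in the cover letter, using the "turn DROPs
into REDIRECTs" example from above. Everything in it is an assumption
for illustration: struct prog, the per-program next[] array (the posted
series keys a map on program ID and return code instead), MAX_CHAIN,
and the toy firewall and mirror programs.

```c
#include <stddef.h>

enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT, XDP_NACT };

#define MAX_CHAIN 32	/* iteration cap; the value is an assumption */

struct pkt { int len; };

struct prog {
	int (*run)(struct pkt *pkt);
	/* next[ret] is the program to run after this one returns ret */
	const struct prog *next[XDP_NACT];
};

/* Walk the chain until a slot is empty, ABORTED is seen, or the cap hits. */
static int run_chain(const struct prog *p, struct pkt *pkt)
{
	int ret = XDP_ABORTED;
	int i;

	for (i = 0; p && i < MAX_CHAIN; i++) {
		ret = p->run(pkt);
		if (ret <= XDP_ABORTED || ret >= XDP_NACT)
			break;
		p = p->next[ret];
	}
	return ret;	/* verdict of the last program that ran */
}

/* Toy programs: the firewall drops short packets, the mirror redirects. */
static int fw_run(struct pkt *pkt) { return pkt->len < 64 ? XDP_DROP : XDP_PASS; }
static int mirror_run(struct pkt *pkt) { (void)pkt; return XDP_REDIRECT; }

static const struct prog mirror = { .run = mirror_run };
static const struct prog firewall = {
	.run = fw_run,
	.next = { [XDP_DROP] = &mirror },	/* only DROP is overridden */
};

static int verdict_for_len(int len)
{
	struct pkt pkt = { .len = len };

	return run_chain(&firewall, &pkt);
}
```

Here a dropped packet ends up with the mirror's XDP_REDIRECT verdict,
while a passed packet keeps XDP_PASS because its slot is empty. No
precedence rule is baked into run_chain(); the table alone decides.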

> > Better if it's programmable. I would prefer to pass the context into
> > the next program, then programs can build their own semantics. Then
> > leave the & of return codes so any program can if needed really drop a
> > packet. The context could be pushed into a shared memory region and
> > then it doesn't even need to be part of the program signature.  
> 
> Since it seems I'll be going down the rabbit hole of baking this into
> the BPF execution environment itself, I guess I'll keep this in mind as
> well. Either by stuffing the previous program return code into the
> context object(s), or by adding a new helper to retrieve it.

I would like to see the ability to retrieve the previous program's
return code, and a new helper would be the simplest approach, as this
could potentially simplify and compact the data structure.
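
A minimal userspace sketch of what I have in mind for the packet
capture case. The names are hypothetical: get_prev_ret() stands in for
a helper that does not exist in the posted series, and struct chain_ctx
is an invented stand-in for wherever the kernel would keep the previous
verdict.

```c
enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

struct chain_ctx {
	int prev_ret;		/* verdict of the previous program */
	unsigned int captured;	/* packets copied out by the capture program */
};

/* Hypothetical helper; no such BPF helper exists in the posted series. */
static int get_prev_ret(const struct chain_ctx *ctx)
{
	return ctx->prev_ret;
}

/*
 * One capture program serves every return code: it records the packet
 * and carries the earlier verdict forward unchanged.
 */
static int capture_prog(struct chain_ctx *ctx)
{
	ctx->captured++;
	return get_prev_ret(ctx);
}

/* Run the capture program after a first program that returned first_verdict. */
static int run_with_capture(int first_verdict)
{
	struct chain_ctx ctx = { .prev_ret = first_verdict, .captured = 0 };

	return capture_prog(&ctx);
}
```

Without such a helper, the same behaviour needs a separate capture
program per return code, each hard-coding the verdict it forwards.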

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer