On Thu, 03 Oct 2019 09:48:22 +0200 Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:

> John Fastabend <john.fastabend@xxxxxxxxx> writes:
>
> > Toke Høiland-Jørgensen wrote:
> >> John Fastabend <john.fastabend@xxxxxxxxx> writes:
> >>
> >> > Toke Høiland-Jørgensen wrote:
> >> >> Alan Maguire <alan.maguire@xxxxxxxxxx> writes:
> >> >>
> >> >> > On Wed, 2 Oct 2019, Toke Høiland-Jørgensen wrote:
> >> >> >
> >> >> >> This series adds support for executing multiple XDP programs on a single
> >> >> >> interface in sequence, through the use of chain calls, as discussed at the Linux
> >> >> >> Plumbers Conference last month:
> >> >> >>
> >> >> >> https://linuxplumbersconf.org/event/4/contributions/460/
> >> >> >>
> >> >> >> # HIGH-LEVEL IDEA
> >> >> >>
> >> >> >> The basic idea is to express the chain call sequence through a special map type,
> >> >> >> which contains a mapping from a (program, return code) tuple to another program
> >> >> >> to run next in the sequence. Userspace can populate this map to express
> >> >> >> arbitrary call sequences, and update the sequence by updating or replacing the
> >> >> >> map.
> >> >> >>
> >> >> >> The actual execution of the program sequence is done in bpf_prog_run_xdp(),
> >> >> >> which will look up the chain sequence map, and if found, will loop through calls
> >> >> >> to BPF_PROG_RUN, looking up the next XDP program in the sequence based on the
> >> >> >> previous program ID and return code.
> >> >> >>
> >> >> >> An XDP chain call map can be installed on an interface by means of a new netlink
> >> >> >> attribute containing an fd pointing to a chain call map. This can be supplied
> >> >> >> along with the XDP prog fd, so that a chain map is always installed together
> >> >> >> with an XDP program.
> >> >> >>
> >> >> >
> >> >> > This is great stuff Toke!
> >> >>
> >> >> Thanks! :)
> >> >>
> >> >> > One thing that wasn't immediately clear to me - and this may be just
> >> >> > me - is the relationship between program behaviour for the XDP_DROP
> >> >> > case and chain call execution. My initial thought was that a program
> >> >> > in the chain XDP_DROP'ping the packet would terminate the call chain,
> >> >> > but on looking at patch #4 it seems that the only way the call chain
> >> >> > execution is terminated is if
> >> >> >
> >> >> > - XDP_ABORTED is returned from a program in the call chain; or
> >> >>
> >> >> Yes. Not actually sure about this one...
> >> >>
> >> >> > - the map entry for the next program (determined by the return value
> >> >> >   of the current program) is empty; or
> >> >>
> >> >> This will be the common exit condition, I expect
> >> >>
> >> >> > - we run out of entries in the map
> >> >>
> >> >> You mean if we run the iteration counter to zero, right?
> >> >>
> >> >> > The return value of the last-executed program in the chain seems to be
> >> >> > what determines packet processing behaviour after executing the chain
> >> >> > (_DROP, _TX, _PASS, etc). So there's no way to both XDP_PASS and
> >> >> > XDP_TX a packet from the same chain, right? Just want to make sure
> >> >> > I've got the semantics correct. Thanks!
> >> >>
> >> >> Yeah, you've got all this right. The chain call mechanism itself doesn't
> >> >> change any of the underlying fundamentals of XDP. I.e., each packet gets
> >> >> exactly one verdict.
> >> >>
> >> >> For chaining actual XDP programs that do different things to the packet,
> >> >> I expect that the most common use case will be to only run the next
> >> >> program if the previous one returns XDP_PASS. That will make the most
> >> >> semantic sense I think.
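
A minimal sketch of the chain call semantics confirmed above, written as a userspace Python model rather than kernel code. The names run_chain, MAX_CHAIN_DEPTH and the three example programs are hypothetical, and a plain dict stands in for the chain call map. The numeric values are those of enum xdp_action. Stopping on XDP_ABORTED is included because the quoted text lists it, although it is flagged there as not settled.

```python
from enum import IntEnum


class XdpAction(IntEnum):
    # Numeric values mirror enum xdp_action in the kernel UAPI headers.
    XDP_ABORTED = 0
    XDP_DROP = 1
    XDP_PASS = 2
    XDP_TX = 3
    XDP_REDIRECT = 4


# Hypothetical limit: the thread only says there is an iteration counter.
MAX_CHAIN_DEPTH = 32


def run_chain(first_id, programs, chain_map, packet, max_depth=MAX_CHAIN_DEPTH):
    """Run programs in sequence; return (verdict, list of program ids run).

    programs:  dict of program id -> callable(packet) -> XdpAction
    chain_map: dict of (program id, return code) -> next program id
    max_depth: must be at least 1
    """
    prog_id = first_id
    executed = []
    verdict = XdpAction.XDP_ABORTED
    for _ in range(max_depth):              # exit 3: iteration counter runs out
        verdict = programs[prog_id](packet)
        executed.append(prog_id)
        if verdict == XdpAction.XDP_ABORTED:
            break                           # exit 1: a program aborted
        next_id = chain_map.get((prog_id, verdict))
        if next_id is None:
            break                           # exit 2: no map entry (common case)
        prog_id = next_id
    # The last program that ran decides the single verdict for the packet.
    return verdict, executed


# Hypothetical example programs, for illustration only.
def firewall(pkt):
    return XdpAction.XDP_DROP if pkt.get("blocked") else XdpAction.XDP_PASS


def capture(pkt):
    pkt["captured"] = True                  # e.g. copy the packet to userspace
    return XdpAction.XDP_DROP               # carry the previous verdict forward


def forwarder(pkt):
    return XdpAction.XDP_TX


programs = {1: firewall, 2: capture, 3: forwarder}
chain_map = {
    (1, XdpAction.XDP_PASS): 3,             # passed packets go to the forwarder
    (1, XdpAction.XDP_DROP): 2,             # dropped packets get captured first
}
```

With this map, run_chain(1, programs, chain_map, {"blocked": True}) runs programs 1 and 2 and ends with XDP_DROP, while an unblocked packet runs 1 and 3 and ends with XDP_TX. Note that the capture program has to hard-code XDP_DROP, which is the "different program per return code" workaround mentioned in the thread.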
> >> >>
> >> >> But there are also use cases where one would want to match on the other
> >> >> return codes; such as packet capture, for instance, where one might
> >> >> install a capture program that would carry forward the previous return
> >> >> code, but do something to the packet (throw it out to userspace) first.
> >> >>
> >> >> For the latter use case, the question is if we need to expose the
> >> >> previous return code to the program when it runs. You can do things
> >> >> without it (by just using a different program per return code), but it
> >> >> may simplify things if we just expose the return code. However, since
> >> >> this will also change the semantics for running programs, I decided to
> >> >> leave that off for now.
> >> >>
> >> >> -Toke
> >> >
> >> > In other cases where programs (e.g. cgroups) are run in an array the
> >> > return codes are 'AND'ed together so that we get
> >> >
> >> >    result1 & result2 & ... & resultN

But the XDP return codes are not bit values, so an AND operation doesn't
make sense to me.

> >>
> >> How would that work with multiple programs, though? PASS -> DROP seems
> >> obvious, but what if the first program returns TX? Also, programs may
> >> want to be able to actually override return codes (e.g., say you want to
> >> turn DROPs into REDIRECTs, to get all your dropped packets mirrored to
> >> your IDS or something).
> >
> > In general I think either you hard code a precedence that will have to
> > be overly conservative because if one program (your firewall) tells
> > XDP to drop the packet and some other program redirects it, passes,
> > etc. that seems incorrect to me. Or you get creative with the
> > precedence rules and they become complex and difficult to manage,
> > where a drop will drop a packet unless a previous/preceding program
> > redirects it, etc. I think any hard coded precedence you come up with
> > will make someone happy and some other user annoyed. Defeating the
> > programmability of BPF.
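
To make the point about AND-ing return codes concrete, here is a small Python sketch (not kernel code). The helper name and_verdicts is hypothetical, and the numeric values are those of enum xdp_action. The cgroup case works because those return codes are effectively boolean (allow or deny). The XDP codes are a plain enumeration, so a bitwise AND yields verdicts that neither program asked for.

```python
from enum import IntEnum


class XdpAction(IntEnum):
    # Numeric values mirror enum xdp_action in the kernel UAPI headers.
    XDP_ABORTED = 0
    XDP_DROP = 1
    XDP_PASS = 2
    XDP_TX = 3
    XDP_REDIRECT = 4


def and_verdicts(*verdicts):
    """Bitwise-AND a sequence of XDP verdicts, cgroup-array style."""
    result = int(verdicts[0])
    for v in verdicts[1:]:
        result &= int(v)
    return XdpAction(result)


# DROP (0b001) & PASS (0b010) == 0b000: XDP_ABORTED, not XDP_DROP.
drop_and_pass = and_verdicts(XdpAction.XDP_DROP, XdpAction.XDP_PASS)

# PASS (0b010) & TX (0b011) == 0b010: the TX verdict is silently lost.
pass_and_tx = and_verdicts(XdpAction.XDP_PASS, XdpAction.XDP_TX)

# PASS (0b010) & REDIRECT (0b100) == 0b000: XDP_ABORTED again.
pass_and_redirect = and_verdicts(XdpAction.XDP_PASS, XdpAction.XDP_REDIRECT)
```

So even the "obvious" PASS -> DROP case does not come out as a drop under a plain AND; any combining rule would need an explicit precedence table, which is the hard-coded policy being argued against here.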
>
> Yeah, exactly. That's basically why I punted on that completely.
> Besides, technically you can get this by just installing different
> programs in each slot if you really need it.

I would really like to avoid hard coding precedence. I know it is
"challenging" that we want to allow overruling any XDP return code, but
I think it makes sense and it is the most flexible solution.

> > Better if it's programmable. I would prefer to pass the context into
> > the next program then programs can build their own semantics. Then
> > leave the & of return codes so any program can if needed really drop a
> > packet. The context could be pushed into a shared memory region and
> > then it doesn't even need to be part of the program signature.
>
> Since it seems I'll be going down the rabbit hole of baking this into
> the BPF execution environment itself, I guess I'll keep this in mind as
> well. Either by stuffing the previous program return code into the
> context object(s), or by adding a new helper to retrieve it.

I would like to see the ability to retrieve the previous program's
return code, and a new helper would be the simplest approach, as this
could potentially simplify and compact the data structure.

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer