Re: [PATCH RFC] netfilter: nf_tables: add flowtable map for xdp offload



Florian Westphal <fw@xxxxxxxxx> writes:

> This adds a small internal mapping table so that a new bpf (xdp) kfunc
> can perform lookups in a flowtable.
> I have no intent to push this without nft integration of the xdp program,
> this RFC is just to get comments on the general direction because there
> is a chicken/egg issue:
> As-is, xdp program has access to the device pointer, but no way to do a
> lookup in a flowtable -- there is no way to obtain the needed struct
> without whacky stunts.

So IIUC, this would all be controlled by userspace anyway (by
the nft binary), right? In which case, couldn't userspace also provide
the reference to the right flowtable instance, by sticking it into a bpf
map? We'd probably need some special handling on the UAPI side to insert
a flowtable pointer, but from the BPF side it could just look like a
kptr in a map that the program pulls out and passes to the lookup kfunc.
And the map would take a refcnt, making sure the table doesn't disappear
underneath the XDP program. It could even improve performance since
there would be one less hashtable lookup.
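
To make the idea a bit more concrete, here is a rough userspace model of
the "referenced pointer in a map slot" scheme. This is not kernel or BPF
code; every name in it (flowtable_model, map_slot, slot_store, slot_clear,
xdp_model_lookup) is made up for illustration, and the real UAPI and kfunc
signatures would still need to be worked out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's flowtable object. */
struct flowtable_model {
	int refcnt;	/* references currently held on the table */
	int lookups;	/* lookups served, only here for illustration */
};

/* Hypothetical stand-in for a map slot holding a referenced pointer. */
struct map_slot {
	struct flowtable_model *ft;
};

/* Control plane side: install the table; the slot takes a reference. */
static void slot_store(struct map_slot *slot, struct flowtable_model *ft)
{
	assert(slot->ft == NULL);
	ft->refcnt++;
	slot->ft = ft;
}

/* Clearing the slot drops the reference again. */
static void slot_clear(struct map_slot *slot)
{
	if (slot->ft) {
		slot->ft->refcnt--;
		slot->ft = NULL;
	}
}

/*
 * Data path side: the program pulls the table straight out of the slot,
 * so there is no extra per-device hashtable lookup.
 * Returns 0 if a table was available, -1 to fall back to the stack.
 */
static int xdp_model_lookup(struct map_slot *slot)
{
	struct flowtable_model *ft = slot->ft;

	if (!ft)
		return -1;
	ft->lookups++;
	return 0;
}
```

As long as the slot holds its reference, the table's refcnt in this model
stays above the owner's own count, which is the property that would keep
the table from disappearing underneath the XDP program.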

The drawback would be that this would make it harder to integrate into
other XDP data planes, as you'd need to coordinate with nft to keep the
right flowtable references alive even if nft doesn't control the XDP
program. But maybe that's doable, somehow?


> My thinking is to add an xdp-offload flag to the nft grammar only.
> It's not needed on nf uapi side and it would tell nft to attach the xdp
> flowtable forward program to the devices listed in the flowtable.
> Also, packet flow is altered (qdiscs are bypassed), which is a strong
> argument against default-usage.

I agree that at this point XDP has too many quirks to be something we
can turn on by default. However, I think we should support XDP data
planes that are not necessarily under the control of nft itself.
Specifically, I am planning to add an 'xdp-forward' utility to xdp-tools
which would enable a semi-automatic XDP fast path using both this and
other hooks like the fib lookup helper. So it would be nice to make the
different pieces as loosely coupled as is practical (cf what I wrote
above).

> Open questions:
> Do we need to support dev-in-multiple flowtables?  I would like to
> avoid this, this likely means the future "xdp" flag in nftables would
> be restricted to "inet" family.  Alternative would be to change the key to
> 'device address plus protocol family', the xdp prog could derive that from the
> packet data.

We can always start with the simple case and add more options later if
it turns out to be useful? With kfuncs we do have some flexibility in
terms of adjusting the API (although I think we should strive for
keeping it as stable as we can).
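
For reference, the "device plus protocol family" key from the quoted
alternative could look roughly like the userspace sketch below. It is only
a model: the enum, struct and function names are invented, the linear
table stands in for whatever hashtable the kernel would use, and the
ethertype is assumed to already be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical family values, not the kernel's NFPROTO_* constants. */
enum model_family {
	MODEL_FAMILY_NONE = 0,
	MODEL_FAMILY_IPV4,
	MODEL_FAMILY_IPV6,
};

/* Derive the family from the Ethernet type field (host byte order). */
static enum model_family family_from_ethertype(uint16_t ethertype)
{
	switch (ethertype) {
	case 0x0800:
		return MODEL_FAMILY_IPV4;
	case 0x86DD:
		return MODEL_FAMILY_IPV6;
	default:
		return MODEL_FAMILY_NONE;
	}
}

/* Composite key: device address plus protocol family. */
struct model_key {
	const void *dev;
	enum model_family family;
};

struct model_entry {
	struct model_key key;
	int flowtable_id;	/* placeholder for a flowtable reference */
};

/* Returns the matching flowtable id, or -1 if nothing matches. */
static int model_lookup(const struct model_entry *tbl, size_t n,
			const void *dev, uint16_t ethertype)
{
	enum model_family fam = family_from_ethertype(ethertype);
	size_t i;

	assert(tbl != NULL || n == 0);
	if (fam == MODEL_FAMILY_NONE)
		return -1;
	for (i = 0; i < n; i++) {
		if (tbl[i].key.dev == dev && tbl[i].key.family == fam)
			return tbl[i].flowtable_id;
	}
	return -1;
}
```

With such a key the same device can sit in one IPv4 and one IPv6
flowtable at once, at the cost of the program having to parse the
ethertype before it can do the lookup.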

> Timeout handling.  Should the XDP program even bother to refresh the
> flowtable timeout?
> It might make more sense to intentionally have packets
> flow through the normal path periodically so neigh entries are up to
> date.

Hmm, I see what you mean, but I worry that this would lead to some nasty
latency blips when a flow transitions back and forth between kernel and
XDP paths. Also, there's a reordering problem as the state is changed:
the first packet goes through the stack, sets the flow state to active, then
gets transmitted. But while that sits in the qdisc waiting to go out on
the wire, the next packet arrives, gets handled by the XDP fastpath and
ends up overtaking the first packet on the TX side. Not sure we have a
good solution for this in general :(
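
The overtaking scenario can be modelled with a few lines of userspace C.
This is a toy: the struct, the function names and any delay values fed
into it are invented for illustration and are not measurements of
anything.

```c
#include <assert.h>

/* Hypothetical packet record for the model. */
struct pkt_model {
	int seq;		/* order of arrival within the flow */
	long arrival_us;	/* arrival time on the RX side */
	int fast_path;		/* 1 if handled by the XDP fast path */
};

/*
 * Time at which the packet reaches the wire: arrival plus the delay of
 * the path it took (stack + qdisc for the slow path, XDP otherwise).
 */
static long tx_time_us(const struct pkt_model *p, long slow_delay_us,
		       long fast_delay_us)
{
	return p->arrival_us + (p->fast_path ? fast_delay_us : slow_delay_us);
}

/*
 * Returns 1 if the later-arriving packet leaves before the earlier one,
 * i.e. the flow gets reordered on the TX side.
 */
static int overtakes(const struct pkt_model *first,
		     const struct pkt_model *second,
		     long slow_delay_us, long fast_delay_us)
{
	assert(first->arrival_us <= second->arrival_us);
	return tx_time_us(second, slow_delay_us, fast_delay_us) <
	       tx_time_us(first, slow_delay_us, fast_delay_us);
}
```

In this model reordering happens whenever the gap between the two
arrivals is smaller than the difference between the slow path and fast
path delays, which is exactly the window right after the flow state
flips to active.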

