Kumar Kartikeya Dwivedi wrote:
> On Wed, Jun 02, 2021 at 11:24:36PM IST, Martin KaFai Lau wrote:
> > On Wed, Jun 02, 2021 at 10:48:02AM +0200, Toke Høiland-Jørgensen wrote:
> > > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> > >
> > > >> > In general the garbage collection in any form doesn't scale.
> > > >> > The conntrack logic doesn't need it. The cillium conntrack is a great
> > > >> > example of how to implement a conntrack without GC.
> > > >>
> > > >> That is simply not a conntrack. We expire connections based on
> > > >> its time, not based on the size of the map where it residents.
> > > >
> > > > Sounds like your goal is to replicate existing kernel conntrack
> > > > as bpf program by doing exactly the same algorithm and repeating
> > > > the same mistakes. Then add kernel conntrack functions to allow list
> > > > of kfuncs (unstable helpers) and call them from your bpf progs.
> > >
> > > FYI, we're working on exactly this (exposing kernel conntrack to BPF).
> > > Hoping to have something to show for our efforts before too long, but
> > > it's still in a bit of an early stage...
> > Just curious, what conntrack functions will be made callable to BPF?
>
> Initially we're planning to expose the equivalent of nf_conntrack_in and
> nf_conntrack_confirm to XDP and TC programs (so XDP one works without an skb,
> and TC one works with an skb), to map these to higher level lookup/insert.
>
> --
> Kartikeya

I think this is a missed opportunity. I can't see any advantage to
tying an XDP datapath into nft. For local connections, use a socket
lookup; no need for tables at all. For middle boxes you need some
tables, but again I really don't see why you want nft here. An entirely
XDP-based connection tracker is going to be faster, easier to debug,
and easier to tune to do what you want as your use cases change.

Other than architecture disagreements, the implementation of this gets
ugly.
You will need to export a set of nft hooks, teach nft about xdp_buffs,
and then poke nft on every packet. Just looking at nf_conntrack_in()
tells me you likely need some serious surgery there to make this work,
and now you've forked a bunch of code that could be done generically in
BPF into some hard-coded C stuff you will have to maintain. Or you do
an ugly hack to convert xdp into skb on every packet, but I'll NAK that
because it really defeats the point of XDP. Maybe the TC side is easier
because you have an skb, but then you miss the real win on the XDP side.

Sorry, I don't see any upsides here, just more work to review and
maintain code that is dubious to start with.

Anyways, original timers code above LGTM.

.John