On Mon, Oct 02, 2023 at 10:47:46AM +0200, Florian Westphal wrote: > Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > Looking at your series, I don't think we are that far each other, see > > below. > > Agree. > > > On Sun, Oct 01, 2023 at 11:08:16PM +0200, Florian Westphal wrote: > > > I've pushed a (not very much tested) version of gc overhaul > > > to passive lookups based on expiry candidates, this removes > > > the need for gc sequence counters. > > > > This patch ("netfilter: nft_set_rbtree: prefer sync gc to async > > worker") > > > > https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/commit/?h=nft_set_gc_query_08&id=edfeb02d758d6a96a3c1c9a483b69e43e5528e87 > > > > goes in the same direction I would like to go with my incomplete patch > > I posted. However: > > > > +static void nft_rbtree_commit(struct nft_set *set) > > +{ > > + struct nft_rbtree *priv = nft_set_priv(set); > > + > > + if (time_after_eq(jiffies, priv->last_gc + nft_set_gc_interval(set))) > > + nft_rbtree_gc(set); > > +} > > > > I don't think this time_after_eq() to postpone element removal will > > work. According to Stefano, you cannot store in the rbtree tree > > duplicated elements. > > Note that in this series the on-demand part is still in place, > there will be no duplicate elements. Right. > > Same problem already exists for this set backend > > in case a transaction add and delete elements in the same batch. > > Unless we maintain two copies. I understand you don't want to maintain > > the two copies but then this time_after_eq() needs to go away. > > I can remove it, I don't think a full traversal (without doing > anything) will be too costly. OK, so what is your proposal to move on? > > According to what I read it seems we agree on that, the only subtle > > difference between your patch and my incomplete patch is this > > time_after_eq(). > > Yes, your patch gets rid of on-demand gc, I agree that we cannot > postpone full run in that case. Yes. > > > Its vs. nf.git but really should be re-targetted to nf-next, I'll > > > try to do this next week: > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/log/?h=nft_set_gc_query_08 > > > > Thanks. The gc sequence removal is a different topic we have been > > discussing for a while. > > Yup. I wanted to explore how much work this is, and it turns > out it gets a lot less ugly of we don't have to hande rbtree and > its end elements. OK. > > Would it be possible to incorrect zap an entry > > with the transaction semantics? I mean: > > Nope, should not happen. > > > #1 transaction to remove element k in set x y > > #2 flush set x y (removes dead element k) > > #3 add element k to set x y expires 3 minutes > > #4 gc transaction freshly added new element > > > > In this case, no dead flag is set on in this new element k on so GC > > transaction will skip it. > > The GC will do lookup, will find the element, will > see its neither dead nor expired so it will be skipped. > > At least thats the idea, entries get zapped only > if they are expired or dead (to handle packet path deletion). Agreed, it is an extra lookup, but it is safer approach. Thanks.