Hi Florian, Looking at your series, I don't think we are that far each other, see below. On Sun, Oct 01, 2023 at 11:08:16PM +0200, Florian Westphal wrote: > I've pushed a (not very much tested) version of gc overhaul > to passive lookups based on expiry candidates, this removes > the need for gc sequence counters. This patch ("netfilter: nft_set_rbtree: prefer sync gc to async worker") https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/commit/?h=nft_set_gc_query_08&id=edfeb02d758d6a96a3c1c9a483b69e43e5528e87 goes in the same direction I would like to go with my incomplete patch I posted. However: +static void nft_rbtree_commit(struct nft_set *set) +{ + struct nft_rbtree *priv = nft_set_priv(set); + + if (time_after_eq(jiffies, priv->last_gc + nft_set_gc_interval(set))) + nft_rbtree_gc(set); +} I don't think this time_after_eq() to postpone element removal will work. According to Stefano, you cannot store in the rbtree tree duplicated elements. Same problem already exists for this set backend in case a transaction add and delete elements in the same batch. Unless we maintain two copies. I understand you don't want to maintain the two copies but then this time_after_eq() needs to go away. This patch above to add .commit interface to rbtree basically undoes: https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/commit/?h=nft_set_gc_query_08&id=ee48e86518d62db058efafb6ec1b9f426c441a9d so better fix it by adding the .commit interface to the rbtree in first place? According to what I read it seems we agree on that, the only subtle difference between your patch and my incomplete patch is this time_after_eq(). > Its vs. nf.git but really should be re-targetted to nf-next, I'll > try to do this next week: > > https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/log/?h=nft_set_gc_query_08 Thanks. The gc sequence removal is a different topic we have been discussing for a while. Would it be possible to incorrect zap an entry with the transaction semantics? I mean: #1 transaction to remove element k in set x y #2 flush set x y (removes dead element k) #3 add element k to set x y expires 3 minutes #4 gc transaction freshly added new element In this case, no dead flag is set on in this new element k on so GC transaction will skip it. As for the element timeout update semantics, I will catch up in a separated email renaming this thread, as this is a new feature and I prefer to re-focus the conversation on your branch, it has been me that has been mixing up different topics anyway.