On Wed, 10 Apr 2024 16:48:49 +0200 Florian Westphal <fw@xxxxxxxxx> wrote: > Pablo reports a crash with large batches of elements with a > back-to-back add/remove pattern. Quoting Pablo: > > add_elem("00000000") timeout 100 ms > ... > add_elem("0000000X") timeout 100 ms > del_elem("0000000X") <---------------- delete one that was just added > ... > add_elem("00005000") timeout 100 ms > > 1) nft_pipapo_remove() removes element 0000000X > Then, KASAN shows a splat. > > Looking at the remove function there is a chance that we will drop a > rule that maps to a non-deactivated element. > > Removal happens in two steps, first we do a lookup for key k and return the > to-be-removed element and mark it as inactive in the next generation. > Then, in a second step, the element gets removed from the set/map. > > The _remove function does not work correctly if we have more than one > element that share the same key. > > This can happen if we insert an element into a set when the set already > holds an element with same key, but the element mapping to the existing > key has timed out or is not active in the next generation. Uh-oh, I didn't imagine that could happen, thanks for fixing this. The fix looks correct to me. Just one nit: > In such case its possible that removal will unmap the wrong element. > If this happens, we will leak the non-deactivated element, it becomes > unreachable. > > The element that got deactivated (and will be freed later) will > remain reachable in the set data structure, this can result in > a crash when such an element is retrieved during lookup (stale > pointer). > > Add a check that the fully matching key does in fact map to the element > that we have marked as inactive in the deactivation step. > If not, we need to continue searching. > > Add a bug/warn trap at the end of the function as well, the remove > function must not ever be called with an invisible/unreachable/non-existent > element. > > Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") > Reported-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > --- > net/netfilter/nft_set_pipapo.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c > index df8de5090246..09c3eedc879b 100644 > --- a/net/netfilter/nft_set_pipapo.c > +++ b/net/netfilter/nft_set_pipapo.c > @@ -2077,6 +2077,8 @@ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, > rules_fx = rules_f0; > > nft_pipapo_for_each_field(f, i, m) { > + bool last = i == m->field_count - 1; > + > if (!pipapo_match_field(f, start, rules_fx, > match_start, match_end)) > break; > @@ -2089,16 +2091,22 @@ static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, > > match_start += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); > match_end += NFT_PIPAPO_GROUPS_PADDED_SIZE(f); > - } > > - if (i == m->field_count) { > - priv->dirty = true; > - pipapo_drop(m, rulemap); > - return; > + if (last) { > + const struct nft_pipapo_elem *this = f->mt[rulemap[i].to].e; > + > + if (this == e) { To avoid this very long line, we could probably assign 'this' on a separate line, or even just: if (f->mt[rulemap[i].to].e == e) { which I find equally readable. Either way, Reviewed-by: Stefano Brivio <sbrivio@xxxxxxxxxx> -- Stefano