Re: [PATCH nf] netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Thu, Sep 28, 2023 at 03:12:44PM +0200, Florian Westphal wrote:
> > nft_rbtree_gc_elem() walks back and removes the end interval element that
> > comes before the expired element.
> > 
> > There is a small chance that we've cached this element as 'rbe_ge'.
> > If this happens, we hold and test a pointer that has been queued for
> > freeing.
> > 
> > It also causes spurious insertion failures:
> > 
> > $ cat nft-test.20230921-143934.826.dMBwHt/test-testcases-sets-0044interval_overlap_0.1/testout.log
> > Error: Could not process rule: File exists
> > add element t s {  0 -  2 }
> >                    ^^^^^^
> > Failed to insert  0 -  2 given:
> > table ip t {
> >         set s {
> >                 type inet_service
> >                 flags interval,timeout
> >                 timeout 2s
> >                 gc-interval 2s
> >         }
> > }
> > 
> > The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.
> > 
> > Reason is that when we try to insert, the tree may hold an expired
> > element that collides with the range we're adding.
> > While we do evict/erase this element, we can trip over this check:
> > 
> > if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
> >       return -ENOTEMPTY;
> > 
> > rbe_ge was erased by the synchronous gc, we should not have done this
> > check.  Next attempt won't find it, so retry results in successful
> > insertion.
> > 
> > Restart in-kernel to avoid such spurious errors.
> > 
> > Such restart are rare, unless userspace intentionally adds very large
> > numbers of elements with very short timeouts while setting a huge
> > gc interval.
> > 
> > Even in this case, this cannot loop forever, on each retry an existing
> > element has been removed.
> > 
> > As the caller is holding the transaction mutex, its impossible
> > for a second entity to add more expiring elements to the tree.
> > 
> > After this it also becomes feasible to remove the async gc worker
> > and perform all garbage collection from the commit path.
> > 
> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> > ---
> >  Changes since v1:
> >   - restart entire insertion process in case we remove
> >   element that we held as lesser/greater overlap detection
> Thanks, I am still looking at finding a way to move this to .commit,
> if no better solution, then let's get this in for the next round.

I don't think its that bad.  In most cases, no restart is required
as no expired elements in the interesting range will ever be found.

I think its fine really, no need to go for two trees or anything
like pipapo is doing.

I have a few patches that build on top of this, first one
gets rid of async worker and places it in commit.



[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux