Re: [PATCH nf] nft_set_rbtree: Move clauses for expired nodes, last active node as leaf

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Stefano,

On Thu, May 12, 2022 at 08:34:21PM +0200, Stefano Brivio wrote:
> In the overlap detection performed as part of the insertion operation,
> we have to skip nodes that are not active in the current generation or
> expired. This was done by adding several conditions in overlap decision
> clauses, which made it hard to check for correctness, and debug the
> actual issue this patch is fixing.
> 
> Consolidate these conditions into a single clause, so that we can
> proceed with debugging and fixing the following issue.
> 
> Case b3. described in comments covers the insertion of a start
> element after an existing end, as a leaf. If all the descendants of
> a given element are inactive, functionally, for the purposes of
> overlap detection, we still have to treat this as a leaf, but we don't.
> 
> If, in the same transaction, we remove a start element to the right,
> remove an end element to the left, and add a start element to the right
> of an existing, active end element, we don't hit case b3. For example:
> 
> - existing intervals: 1-2, 3-5, 6-7
> - transaction: delete 3-5, insert 4-5
> 
> node traversal might happen as follows:
> - 2 (active end)
> - 5 (inactive end)
> - 3 (inactive start)
> 
> when we add 4 as start element, we should simply consider 2 as
> preceding end, and consider it as a leaf, because interval 3-5 has been
> deleted. If we don't, we'll report an overlap because we forgot about
> this.
> 
> Add an additional flag which is set as we find an active end, and reset
> it if we find an active start (to the left). If we finish the traversal
> with this flag set, it means we need to functionally consider the
> previous active end as a leaf, and allow insertion instead of reporting
> overlap.

I can still trigger EEXIST with deletion of existing interval. It
became harder to reproduce after this patch.

After hitting EEXIST, if I do:

        echo "flush ruleset" > test.nft
        nft list ruleset >> test.nft

to dump the existing ruleset, then I run the delete element command
again to remove the interval and it works. Before this patch I could
reproduce it by reloading an existing ruleset dump.

I'm running the script that I'm attaching manually, just one manual
invocation after another.

I ocassionally hit ENOBUFS, but that sounds like a different issue
that I'm looking into.

Thanks.

> Reported-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> Fixes: 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion")
> Signed-off-by: Stefano Brivio <sbrivio@xxxxxxxxxx>
> ---
>  net/netfilter/nft_set_rbtree.c | 92 ++++++++++++++++++++--------------
>  1 file changed, 54 insertions(+), 38 deletions(-)
> 
> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> index 7325bee7d144..dc2184bbe722 100644
> --- a/net/netfilter/nft_set_rbtree.c
> +++ b/net/netfilter/nft_set_rbtree.c
> @@ -222,6 +222,7 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
>  	bool overlap = false, dup_end_left = false, dup_end_right = false;
>  	struct nft_rbtree *priv = nft_set_priv(set);
>  	u8 genmask = nft_genmask_next(net);
> +	bool last_left_node_is_end = false;
>  	struct nft_rbtree_elem *rbe;
>  	struct rb_node *parent, **p;
>  	int d;
> @@ -287,80 +288,95 @@ static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
>  		d = memcmp(nft_set_ext_key(&rbe->ext),
>  			   nft_set_ext_key(&new->ext),
>  			   set->klen);
> -		if (d < 0) {
> +
> +		if (d < 0)
>  			p = &parent->rb_left;
> +		else if (d > 0)
> +			p = &parent->rb_right;
> +		else if (nft_rbtree_interval_end(rbe))
> +			p = &parent->rb_left;
> +		else
> +			p = &parent->rb_right;
>  
> +		/* There might be inactive elements in the tree: ignore them by
> +		 * traversing them without affecting flags.
> +		 *
> +		 * We need to reset the dup_end_left and dup_end_right flags,
> +		 * though, because those only apply to adjacent nodes.
> +		 */
> +		if (!nft_set_elem_active(&rbe->ext, genmask) ||
> +		    nft_set_elem_expired(&rbe->ext)) {
> +			dup_end_left = dup_end_right = false;
> +			continue;
> +		}
> +
> +		if (d < 0) {
>  			if (nft_rbtree_interval_start(new)) {
> -				if (nft_rbtree_interval_end(rbe) &&
> -				    nft_set_elem_active(&rbe->ext, genmask) &&
> -				    !nft_set_elem_expired(&rbe->ext) && !*p)
> -					overlap = false;
> +				/* See case b3. described above.
> +				 *
> +				 * If this is not a leaf, but all nodes below
> +				 * this one are inactive, except for a leaf, we
> +				 * still have to consider it a leaf for the
> +				 * purposes of overlap detection.
> +				 *
> +				 * Set last_left_node_is_end if this is not a
> +				 * leaf and an active end element, and reset it
> +				 * if we find an active start element to the
> +				 * left.
> +				 *
> +				 * If we end the traversal with this flag set,
> +				 * this node is a leaf for the purposes of case
> +				 * b3., and no overlap will be reported.
> +				 */
> +				if (nft_rbtree_interval_end(rbe)) {
> +					if (!*p)
> +						overlap = false;
> +					else
> +						last_left_node_is_end = true;
> +				} else {
> +					last_left_node_is_end = false;
> +				}
>  			} else {
> +
>  				if (dup_end_left && !*p)
>  					return -ENOTEMPTY;
>  
> -				overlap = nft_rbtree_interval_end(rbe) &&
> -					  nft_set_elem_active(&rbe->ext,
> -							      genmask) &&
> -					  !nft_set_elem_expired(&rbe->ext);
> -
> +				overlap = nft_rbtree_interval_end(rbe);
>  				if (overlap) {
>  					dup_end_right = true;
>  					continue;
>  				}
>  			}
>  		} else if (d > 0) {
> -			p = &parent->rb_right;
> -
>  			if (nft_rbtree_interval_end(new)) {
>  				if (dup_end_right && !*p)
>  					return -ENOTEMPTY;
>  
> -				overlap = nft_rbtree_interval_end(rbe) &&
> -					  nft_set_elem_active(&rbe->ext,
> -							      genmask) &&
> -					  !nft_set_elem_expired(&rbe->ext);
> -
> +				overlap = nft_rbtree_interval_end(rbe);
>  				if (overlap) {
>  					dup_end_left = true;
>  					continue;
>  				}
> -			} else if (nft_set_elem_active(&rbe->ext, genmask) &&
> -				   !nft_set_elem_expired(&rbe->ext)) {
> +			} else {
>  				overlap = nft_rbtree_interval_end(rbe);
>  			}
>  		} else {
>  			if (nft_rbtree_interval_end(rbe) &&
>  			    nft_rbtree_interval_start(new)) {
> -				p = &parent->rb_left;
> -
> -				if (nft_set_elem_active(&rbe->ext, genmask) &&
> -				    !nft_set_elem_expired(&rbe->ext))
> -					overlap = false;
> +				overlap = false;
>  			} else if (nft_rbtree_interval_start(rbe) &&
>  				   nft_rbtree_interval_end(new)) {
> -				p = &parent->rb_right;
> -
> -				if (nft_set_elem_active(&rbe->ext, genmask) &&
> -				    !nft_set_elem_expired(&rbe->ext))
> -					overlap = false;
> -			} else if (nft_set_elem_active(&rbe->ext, genmask) &&
> -				   !nft_set_elem_expired(&rbe->ext)) {
> +				overlap = false;
> +			} else {
>  				*ext = &rbe->ext;
>  				return -EEXIST;
> -			} else {
> -				overlap = false;
> -				if (nft_rbtree_interval_end(rbe))
> -					p = &parent->rb_left;
> -				else
> -					p = &parent->rb_right;
>  			}
>  		}
>  
>  		dup_end_left = dup_end_right = false;
>  	}
>  
> -	if (overlap)
> +	if (overlap && !last_left_node_is_end)
>  		return -ENOTEMPTY;
>  
>  	rb_link_node_rcu(&new->node, parent, p);
> -- 
> 2.35.1
> 

Attachment: rbtree-bug.sh
Description: Bourne shell script


[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux