Patch "netfilter: nft_set_hash: skip duplicated elements pending gc run" has been added to the 5.15-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    netfilter: nft_set_hash: skip duplicated elements pending gc run

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     netfilter-nft_set_hash-skip-duplicated-elements-pend.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit bc68880146f0e721d84010bb438233d55b5aaad1
Author: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
Date:   Mon Dec 2 00:04:49 2024 +0100

    netfilter: nft_set_hash: skip duplicated elements pending gc run
    
    [ Upstream commit 7ffc7481153bbabf3332c6a19b289730c7e1edf5 ]
    
    rhashtable does not provide stable walk, duplicated elements are
    possible in case of resizing. I considered that checking for errors when
    calling rhashtable_walk_next() was sufficient to detect the resizing.
    However, rhashtable_walk_next() returns -EAGAIN only at the end of the
    iteration, which is too late, because a gc work containing duplicated
    elements could have been already scheduled for removal to the worker.
    
    Add a u32 gc worker sequence number per set, bump it on every workqueue
    run. Annotate gc worker sequence number on the expired element. Use it
    to skip those already seen in this gc workqueue run.
    
    Note that this new field is never reset in case gc transaction fails, so
    next gc worker run on the expired element overrides it. Wraparound of gc
    worker sequence number should not be an issue with stale gc worker
    sequence number in the element, that would just postpone the element
    removal in one gc run.
    
    Note that it is not possible to use flags to annotate that element is
    pending gc run to detect duplicates, given that gc transaction can be
    invalidated in case of update from the control plane, therefore, not
    allowing to clear such flag.
    
    On x86_64, pahole reports no changes in the size of nft_rhash_elem.
    
    Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API")
    Reported-by: Laurent Fasnacht <laurent.fasnacht@xxxxxxxxx>
    Tested-by: Laurent Fasnacht <laurent.fasnacht@xxxxxxxxx>
    Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 1fd3b413350dc..5c4209b49bda7 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -24,10 +24,12 @@
 struct nft_rhash {
 	struct rhashtable		ht;
 	struct delayed_work		gc_work;
+	u32				wq_gc_seq;
 };
 
 struct nft_rhash_elem {
 	struct rhash_head		node;
+	u32				wq_gc_seq;
 	struct nft_set_ext		ext;
 };
 
@@ -339,6 +341,10 @@ static void nft_rhash_gc(struct work_struct *work)
 	if (!gc)
 		goto done;
 
+	/* Elements never collected use a zero gc worker sequence number. */
+	if (unlikely(++priv->wq_gc_seq == 0))
+		priv->wq_gc_seq++;
+
 	rhashtable_walk_enter(&priv->ht, &hti);
 	rhashtable_walk_start(&hti);
 
@@ -356,6 +362,14 @@ static void nft_rhash_gc(struct work_struct *work)
 			goto try_later;
 		}
 
+		/* rhashtable walk is unstable, already seen in this gc run?
+		 * Then, skip this element. In case of (unlikely) sequence
+		 * wraparound and stale element wq_gc_seq, next gc run will
+		 * just find this expired element.
+		 */
+		if (he->wq_gc_seq == priv->wq_gc_seq)
+			continue;
+
 		if (nft_set_elem_is_dead(&he->ext))
 			goto dead_elem;
 
@@ -372,6 +386,8 @@ static void nft_rhash_gc(struct work_struct *work)
 		if (!gc)
 			goto try_later;
 
+		/* annotate gc sequence for this attempt. */
+		he->wq_gc_seq = priv->wq_gc_seq;
 		nft_trans_gc_elem_add(gc, he);
 	}
 




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux