This is a note to let you know that I've just added the patch titled

    netfilter: nf_tables: don't skip expired elements during walk

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     netfilter-nf_tables-don-t-skip-expired-elements-during-walk.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From stable-owner@xxxxxxxxxxxxxxx Tue Nov 21 12:13:55 2023
From: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
Date: Tue, 21 Nov 2023 13:13:14 +0100
Subject: netfilter: nf_tables: don't skip expired elements during walk
To: netfilter-devel@xxxxxxxxxxxxxxx
Cc: gregkh@xxxxxxxxxxxxxxxxxxx, sashal@xxxxxxxxxx, stable@xxxxxxxxxxxxxxx
Message-ID: <20231121121333.294238-8-pablo@xxxxxxxxxxxxx>

From: Florian Westphal <fw@xxxxxxxxx>

commit 24138933b97b055d486e8064b4a1721702442a9b upstream.

There is an asymmetry between commit/abort and preparation phase if the
following conditions are met:

1. set is a verdict map ("1.2.3.4 : jump foo")
2. timeouts are enabled

In this case, following sequence is problematic:

1. element E in set S refers to chain C
2. userspace requests removal of set S
3. kernel does a set walk to decrement chain->use count for all elements
   from preparation phase
4. kernel does another set walk to remove elements from the commit phase
   (or another walk to do a chain->use increment for all elements from
    abort phase)

If E has already expired in 1), it will be ignored during list walk, so its use count
won't have been changed.

Then, when set is culled, ->destroy callback will zap the element via
nf_tables_set_elem_destroy(), but this function is only safe for
elements that have been deactivated earlier from the preparation phase:
lack of earlier deactivate removes the element but leaks the chain use
count, which results in a WARN splat when the chain gets removed later,
plus a leak of the nft_chain structure.

Update pipapo_get() not to skip expired elements, otherwise flush
command reports bogus ENOENT errors.

Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support")
Fixes: 9d0982927e79 ("netfilter: nft_hash: add support for timeouts")
Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 net/netfilter/nf_tables_api.c  |    4 ++++
 net/netfilter/nft_set_hash.c   |    2 --
 net/netfilter/nft_set_rbtree.c |    2 --
 3 files changed, 4 insertions(+), 4 deletions(-)

--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4258,8 +4258,12 @@ static int nf_tables_dump_setelem(const
 				  const struct nft_set_iter *iter,
 				  struct nft_set_elem *elem)
 {
+	const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
 	struct nft_set_dump_args *args;
 
+	if (nft_set_elem_expired(ext))
+		return 0;
+
 	args = container_of(iter, struct nft_set_dump_args, iter);
 	return nf_tables_fill_setelem(args->skb, set, elem);
 }
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -277,8 +277,6 @@ static void nft_rhash_walk(const struct
 
 		if (iter->count < iter->skip)
 			goto cont;
-		if (nft_set_elem_expired(&he->ext))
-			goto cont;
 		if (!nft_set_elem_active(&he->ext, iter->genmask))
 			goto cont;
 
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -553,8 +553,6 @@ static void nft_rbtree_walk(const struct
 
 		if (iter->count < iter->skip)
 			goto cont;
-		if (nft_set_elem_expired(&rbe->ext))
-			goto cont;
 		if (!nft_set_elem_active(&rbe->ext, iter->genmask))
 			goto cont;
 


Patches currently in stable-queue which might be from stable-owner@xxxxxxxxxxxxxxx are

queue-5.4/netfilter-nf_tables-fix-memleak-when-more-than-255-elements-expired.patch
queue-5.4/netfilter-nft_set_rbtree-fix-overlap-expiration-walk.patch
queue-5.4/netfilter-nf_tables-use-correct-lock-to-protect-gc_list.patch
queue-5.4/netfilter-nf_tables-disable-toggling-dormant-table-state-more-than-once.patch
queue-5.4/netfilter-nf_tables-gc-transaction-race-with-netns-dismantle.patch
queue-5.4/netfilter-nf_tables-drop-map-element-references-from-preparation-phase.patch
queue-5.4/netfilter-nf_tables-fix-gc-transaction-races-with-netns-and-netlink-event-exit-path.patch
queue-5.4/netfilter-nf_tables-don-t-skip-expired-elements-during-walk.patch
queue-5.4/netfilter-nf_tables-remove-busy-mark-and-gc-batch-api.patch
queue-5.4/netfilter-nf_tables-gc-transaction-race-with-abort-path.patch
queue-5.4/netfilter-nf_tables-unregister-flowtable-hooks-on-netns-exit.patch
queue-5.4/netfilter-nft_set_rbtree-switch-to-node-list-walk-for-overlap-detection.patch
queue-5.4/netfilter-nf_tables-adapt-set-backend-to-use-gc-transaction-api.patch
queue-5.4/netfilter-nftables-rename-set-element-data-activation-deactivation-functions.patch
queue-5.4/netfilter-nft_set_rbtree-skip-sync-gc-for-new-elements-in-this-transaction.patch
queue-5.4/netfilter-nf_tables-pass-context-to-nft_set_destroy.patch
queue-5.4/netfilter-nf_tables-bogus-ebusy-when-deleting-flowtable-after-flush-for-5.4.patch
queue-5.4/netfilter-nft_set_hash-try-later-when-gc-hits-eagain-on-iteration.patch
queue-5.4/netfilter-nf_tables-defer-gc-run-if-previous-batch-is-still-pending.patch
queue-5.4/netfilter-nft_set_rbtree-use-read-spinlock-to-avoid-datapath-contention.patch
queue-5.4/netfilter-nf_tables-double-hook-unregistration-in-netns-path.patch
queue-5.4/netfilter-nftables-update-table-flags-from-the-commit-phase.patch
queue-5.4/netfilter-nft_set_hash-mark-set-element-as-dead-when-deleting-from-packet-path.patch
queue-5.4/netfilter-nf_tables-gc-transaction-api-to-avoid-race-with-control-plane.patch
queue-5.4/netfilter-nf_tables-fix-table-flag-updates.patch
queue-5.4/netfilter-nft_set_rbtree-fix-null-deref-on-element-insertion.patch
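
Illustration (not part of the patch): the use-count asymmetry described in the
commit message can be sketched as a toy userspace model. Everything below is
hypothetical and exists only for this sketch: the toy_* types and functions
are simplified stand-ins, not the kernel's nft_chain, set element, or walk
code, and the skip_expired flag merely imitates the pre-patch backend check
that the diff removes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a chain and a verdict-map element. */
struct toy_chain {
	int use;			/* references held by map elements */
};

struct toy_elem {
	struct toy_chain *chain;	/* "jump C" verdict target */
	bool expired;			/* timeout already elapsed */
};

typedef void (*toy_walk_fn)(struct toy_elem *e);

/* Set walk. skip_expired == true imitates the pre-patch backends. */
static void toy_walk(struct toy_elem *elems, size_t n, bool skip_expired,
		     toy_walk_fn fn)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_expired && elems[i].expired)
			continue;
		fn(&elems[i]);
	}
}

/* Preparation phase: drop the chain reference held by one element. */
static void toy_deactivate(struct toy_elem *e)
{
	assert(e->chain->use > 0);
	e->chain->use--;
}

/*
 * Model "userspace removes set S": deactivate all elements via a walk,
 * then destroy the set. Destruction frees elements without touching the
 * chain use count, so whatever the walk missed is leaked.
 * Returns the use count left on chain C afterwards.
 */
int toy_chain_use_after_set_removal(bool skip_expired)
{
	struct toy_chain c = { .use = 2 };	/* one reference per element */
	struct toy_elem s[2] = {
		{ .chain = &c, .expired = false },
		{ .chain = &c, .expired = true },	/* element E */
	};

	toy_walk(s, 2, skip_expired, toy_deactivate);
	return c.use;
}
```

In this model, toy_chain_use_after_set_removal(true) leaves one reference on
the chain (the leak the commit message describes), while
toy_chain_use_after_set_removal(false) brings the count back to zero, which
corresponds to the walk behaviour after the patch.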