While working on speeding up pipapo rule insertions I found that the
map_index needs to be percpu and per-set, not just percpu.

At this time it's possible for a pipapo set to fill the all-zero part
with ones and take the 'might have bits set' area as the
'start-from-zero' area.

First patch changes the scratchpad area to a structure that provides
space for a per-set-and-cpu toggle and uses it instead of the percpu one.

Second patch prepares for patch 3 by adding a new free helper.

Third patch removes the scratch_aligned pointer and makes the AVX2
implementation use the exact same memory addresses for read/store of
the matching state.

Florian Westphal (3):
  netfilter: nft_set_pipapo: store index in scratch maps
  netfilter: nft_set_pipapo: add helper to release pcpu scratch area
  netfilter: nft_set_pipapo: remove scratch_aligned pointer

 net/netfilter/nft_set_pipapo.c      | 96 +++++++++++++----------------
 net/netfilter/nft_set_pipapo.h      | 18 ++++--
 net/netfilter/nft_set_pipapo_avx2.c | 17 +++--
 3 files changed, 63 insertions(+), 68 deletions(-)

-- 
2.43.0