On Fri, 7 May 2021 12:52:47 +0200
Stefano Brivio <sbrivio@xxxxxxxxxx> wrote:

> On Fri, 7 May 2021 12:35:23 +0200
> Florian Westphal <fw@xxxxxxxxx> wrote:
>
> > Arturo Borrero Gonzalez <arturo@xxxxxxxxxxxxx> wrote:
> > > Hi there,
> > >
> > > I got this backtrace in one of my servers. I wonder if it is known or fixed
> > > already in a later version.
> > >
> > > My versions:
> > > * kernel 5.10.24
> > > * nft 0.9.6
> > >
> > > Also, find attached the ruleset that triggered this.
> > >
> > > [Thu May 6 16:20:21 2021] ------------[ cut here ]------------
> > > [Thu May 6 16:20:21 2021] WARNING: CPU: 3 PID: 456 at
> > > arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xc9/0xe0
> > > [Thu May 6 16:20:21 2021] Modules linked in: binfmt_misc nft_nat
> >
> > Hmm, I suspect this is needed (not even compile tested).
> >
> > diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> > --- a/net/netfilter/nft_set_pipapo_avx2.c
> > +++ b/net/netfilter/nft_set_pipapo_avx2.c
> > @@ -1105,6 +1105,18 @@ bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features,
> >  	return true;
> >  }
> >
> > +static void nft_pipapo_avx_begin(void)
> > +{
> > +	local_bh_disable();
> > +	kernel_fpu_begin();
> > +}
> >
> > [...]
> >
> > kernel_fpu_begin() disables preemption, but we can still reenter via
> > softirq.
>
> Right... if that's enough (I'm quite convinced), and the overhead is
> negligible (not as much... I'll test), I would prefer this to the
> fallback option on !irq_fpu_usable() -- it's simpler.

Hmm, wait, the overhead is actually negligible, but I don't think
calling local_bh_disable() from the lookup function would actually
help: crc32c_pcl_intel_update() runs from a kthread, and while it's
using the FPU the softirq triggers, not the other way around... right?

I think we really need to check that the FPU isn't already in use by
the kernel with irq_fpu_usable() instead, just like
crc32c_pcl_intel_update() does.
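To make the ordering concrete, here is a toy user-space model of that scenario. It is only a sketch, not kernel code: all the model_* names and the two flags are made up for illustration, the check is simplified (it ignores interrupted_user_mode()), and "WARN" just means the point where kernel_fpu_begin() would find the FPU already in use by the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for per-CPU kernel state; not the kernel's code. */
static bool in_kernel_fpu;	/* models this_cpu_read(in_kernel_fpu) */
static bool in_softirq;		/* models in_interrupt() for softirq context */

/* Simplified check: fine outside softirq, or if no kernel FPU section is open. */
static bool model_irq_fpu_usable(void)
{
	return !in_softirq || !in_kernel_fpu;
}

/* Returns false where the kernel would warn about a nested FPU section. */
static bool model_kernel_fpu_begin(void)
{
	if (in_kernel_fpu)
		return false;
	in_kernel_fpu = true;
	return true;
}

static void model_kernel_fpu_end(void)
{
	in_kernel_fpu = false;
}

enum path { PATH_AVX2, PATH_GENERIC, PATH_WARN };

/* The lookup; 'check' selects the fallback on !irq_fpu_usable(). */
static enum path model_lookup(bool check)
{
	if (check && !model_irq_fpu_usable())
		return PATH_GENERIC;	/* fall back to the non-AVX2 lookup */
	if (!model_kernel_fpu_begin())
		return PATH_WARN;	/* FPU already claimed by the kthread */
	model_kernel_fpu_end();
	return PATH_AVX2;
}

/* A kthread holds the FPU, then a softirq runs the lookup on the same CPU. */
static enum path softirq_during_kthread_fpu(bool check)
{
	enum path p;

	in_kernel_fpu = false;
	in_softirq = false;
	model_kernel_fpu_begin();	/* kthread, e.g. a CRC computation */
	in_softirq = true;		/* softirq interrupts the kthread */
	p = model_lookup(check);
	in_softirq = false;
	model_kernel_fpu_end();		/* kthread finishes its FPU section */
	return p;
}

/* The softirq runs the lookup while nobody else is using the FPU. */
static enum path softirq_alone(bool check)
{
	enum path p;

	in_kernel_fpu = false;
	in_softirq = true;
	p = model_lookup(check);
	in_softirq = false;
	return p;
}
```

In this model, disabling bottom halves inside model_lookup() would change nothing, because the softirq is already running by the time the lookup starts; only the check at the top of the lookup avoids the nested section.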

Arturo, there's one thing confusing me here: checking 5.10.24, we're
hitting:

	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));

at line 129 of arch/x86/kernel/fpu/core.c, but not:

	WARN_ON_FPU(!irq_fpu_usable());

at line 128? Those should be equivalent in this situation, because
irq_fpu_usable() checks:

	!in_interrupt()
		-> false (softirq here)
	|| interrupted_user_mode()
		-> false (judging from backtrace)
	|| interrupted_kernel_fpu_idle() == !this_cpu_read(in_kernel_fpu);
		-> must be false if warning at line 129 triggers

so irq_fpu_usable() as a whole should be false, and the warning at line
128 should trigger as well.

...I see from tainted flags that some warning was already printed,
could it be that you have a warning from arch/x86/kernel/fpu/core.c:128
in your logs, before this one?

Florian, now that set back-ends are built-in, I'd simply go with
something like (oink):

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 528a2d7ca991..dce866d93fee 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -408,8 +408,8 @@ int pipapo_refill(unsigned long *map, int len, int rules, unsigned long *dst,
  *
  * Return: true on match, false otherwise.
  */
-static bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
-			      const u32 *key, const struct nft_set_ext **ext)
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+		       const u32 *key, const struct nft_set_ext **ext)
 {
 	struct nft_pipapo *priv = nft_set_priv(set);
 	unsigned long *res_map, *fill_map;
diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h
index 25a75591583e..d84afb8fa79a 100644
--- a/net/netfilter/nft_set_pipapo.h
+++ b/net/netfilter/nft_set_pipapo.h
@@ -178,6 +178,8 @@ struct nft_pipapo_elem {
 
 int pipapo_refill(unsigned long *map, int len, int rules, unsigned long *dst,
 		  union nft_pipapo_map_bucket *mt, bool match_only);
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+		       const u32 *key, const struct nft_set_ext **ext);
 
 /**
  * pipapo_and_field_buckets_4bit() - Intersect 4-bit buckets
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index d65ae0e23028..eabdb8d552ee 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1131,6 +1131,9 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
 	bool map_index;
 	int i, ret = 0;
 
+	if (unlikely(!irq_fpu_usable()))
+		return nft_pipapo_lookup(net, set, key, ext);
+
 	m = rcu_dereference(priv->match);
 
 	/* This also protects access to all data related to scratch maps */

-- 
Stefano