Re: nft_pipapo_avx2_lookup backtrace in linux 5.10

On Fri, 7 May 2021 12:52:47 +0200
Stefano Brivio <sbrivio@xxxxxxxxxx> wrote:

> On Fri, 7 May 2021 12:35:23 +0200
> Florian Westphal <fw@xxxxxxxxx> wrote:
> 
> > Arturo Borrero Gonzalez <arturo@xxxxxxxxxxxxx> wrote:  
> > > Hi there,
> > > 
> > > I got this backtrace in one of my servers. I wonder if it is known or fixed
> > > already in a later version.
> > > 
> > > My versions:
> > > * kernel 5.10.24
> > > * nft 0.9.6
> > > 
> > > Also, find attached the ruleset that triggered this.
> > > 
> > > [Thu May  6 16:20:21 2021] ------------[ cut here ]------------
> > > [Thu May  6 16:20:21 2021] WARNING: CPU: 3 PID: 456 at
> > > arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xc9/0xe0
> > > [Thu May  6 16:20:21 2021] Modules linked in: binfmt_misc nft_nat    
> > 
> > Hmm, I suspect this is needed (not even compile tested).
> > 
> > diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> > --- a/net/netfilter/nft_set_pipapo_avx2.c
> > +++ b/net/netfilter/nft_set_pipapo_avx2.c
> > @@ -1105,6 +1105,18 @@ bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features,
> >  	return true;
> >  }
> >  
> > +static void nft_pipapo_avx_begin(void)
> > +{
> > +	local_bh_disable();
> > +	kernel_fpu_begin();
> > +}
> >
> > [...]
> > 
> > kernel_fpu_begin() disables preemption, but we can still reenter via
> > softirq.  
> 
> Right... if that's enough (I'm quite convinced), and the overhead is
> negligible (not as much... I'll test), I would prefer this to the
> fallback option on !irq_fpu_usable() -- it's simpler.

Hmm, wait, the overhead is actually negligible, but I don't think
calling local_bh_disable() from the lookup function would actually
help: crc32c_pcl_intel_update() runs from a kthread, and while it's
using the FPU the softirq triggers, not the other way around... right?

I think we really need to check that the FPU isn't already in use by
the kernel with irq_fpu_usable() instead, just like
crc32c_pcl_intel_update() does.
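
The sequence described above can be sketched in user space. This is only a toy model under stated assumptions: struct cpu_model and every model_* function are hypothetical stand-ins rather than kernel APIs, and the softirq is modelled as a plain function call made while the kthread still owns the FPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names are kernel APIs. */
struct cpu_model {
	bool in_softirq;	/* a softirq is currently running */
	bool in_kernel_fpu;	/* some kernel code currently owns the FPU */
	int nested_begins;	/* FPU claimed while it was already owned */
};

static void model_fpu_begin(struct cpu_model *cpu)
{
	if (cpu->in_kernel_fpu)
		cpu->nested_begins++;	/* the real kernel would warn here */
	cpu->in_kernel_fpu = true;
}

static void model_fpu_end(struct cpu_model *cpu)
{
	assert(cpu->in_kernel_fpu);
	cpu->in_kernel_fpu = false;
}

/* Returns true if the vectorised path was taken, false on fallback. */
static bool model_lookup(struct cpu_model *cpu, bool check_usable)
{
	/* Assumed stand-in for an irq_fpu_usable()-style check. */
	if (check_usable && cpu->in_softirq && cpu->in_kernel_fpu)
		return false;	/* caller would use the generic lookup */

	model_fpu_begin(cpu);
	/* ... vectorised matching would happen here ... */
	model_fpu_end(cpu);
	return true;
}

/* A kthread owns the FPU when a softirq fires and performs a lookup. */
static int model_softirq_during_kthread_fpu(bool check_usable)
{
	struct cpu_model cpu = { 0 };

	model_fpu_begin(&cpu);		/* kthread, e.g. a checksum update */
	cpu.in_softirq = true;		/* softirq interrupts the kthread */
	model_lookup(&cpu, check_usable);
	cpu.in_softirq = false;

	return cpu.nested_begins;
}
```

In this model, disabling bottom halves inside model_lookup() would change nothing, because the softirq is already running by the time the lookup is entered. Only the check before claiming the FPU avoids the nested begin.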

Arturo, there's one thing confusing me here: checking 5.10.24, we're
hitting:

	WARN_ON_FPU(this_cpu_read(in_kernel_fpu));

at line 129 of arch/x86/kernel/fpu/core.c, but not:

	WARN_ON_FPU(!irq_fpu_usable());

at line 128? Those should be equivalent in this situation, because
irq_fpu_usable() checks:

	!in_interrupt() -> false (softirq here) ||

	interrupted_user_mode() -> false (judging from backtrace) ||

	interrupted_kernel_fpu_idle()
		== !this_cpu_read(in_kernel_fpu);
		-> must be false if warning at line 129 triggers

...I see from tainted flags that some warning was already printed,
could it be that you have a warning from arch/x86/kernel/fpu/core.c:128
in your logs, before this one?
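
The equivalence argued above can be checked with a small user-space model. It is a sketch only: struct fpu_ctx and the model_* functions are hypothetical names, and it assumes that interrupted_kernel_fpu_idle() returns true only when in_kernel_fpu is clear.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state; not kernel APIs. */
struct fpu_ctx {
	bool in_interrupt;		/* models in_interrupt() */
	bool interrupted_user_mode;	/* models interrupted_user_mode() */
	bool in_kernel_fpu;		/* models this_cpu_read(in_kernel_fpu) */
};

/* Models the three-way OR that irq_fpu_usable() is described as checking. */
static bool model_irq_fpu_usable(const struct fpu_ctx *c)
{
	return !c->in_interrupt ||
	       c->interrupted_user_mode ||
	       !c->in_kernel_fpu;	/* assumed interrupted_kernel_fpu_idle() */
}

/* Would WARN_ON_FPU(!irq_fpu_usable()) at line 128 fire? */
static bool model_warn_128(const struct fpu_ctx *c)
{
	return !model_irq_fpu_usable(c);
}

/* Would WARN_ON_FPU(this_cpu_read(in_kernel_fpu)) at line 129 fire? */
static bool model_warn_129(const struct fpu_ctx *c)
{
	return c->in_kernel_fpu;
}
```

With in_interrupt set and interrupted_user_mode clear, as the backtrace suggests, the two model warnings fire for exactly the same values of in_kernel_fpu, which is why a warning at line 129 without one at line 128 is surprising.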

Florian, now that set back-ends are built-in, I'd simply go with
something like (oink):

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 528a2d7ca991..dce866d93fee 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -408,8 +408,8 @@ int pipapo_refill(unsigned long *map, int len, int rules, unsigned long *dst,
  *
  * Return: true on match, false otherwise.
  */
-static bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
-                             const u32 *key, const struct nft_set_ext **ext)
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+                      const u32 *key, const struct nft_set_ext **ext)
 {
        struct nft_pipapo *priv = nft_set_priv(set);
        unsigned long *res_map, *fill_map;
diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h
index 25a75591583e..d84afb8fa79a 100644
--- a/net/netfilter/nft_set_pipapo.h
+++ b/net/netfilter/nft_set_pipapo.h
@@ -178,6 +178,8 @@ struct nft_pipapo_elem {
 
 int pipapo_refill(unsigned long *map, int len, int rules, unsigned long *dst,
                  union nft_pipapo_map_bucket *mt, bool match_only);
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+                      const u32 *key, const struct nft_set_ext **ext);
 
 /**
  * pipapo_and_field_buckets_4bit() - Intersect 4-bit buckets
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index d65ae0e23028..eabdb8d552ee 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1131,6 +1131,9 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
        bool map_index;
        int i, ret = 0;
 
+       if (unlikely(!irq_fpu_usable()))
+               return nft_pipapo_lookup(net, set, key, ext);
+
        m = rcu_dereference(priv->match);
 
        /* This also protects access to all data related to scratch maps */


-- 
Stefano



