Re: [PATCH v2 -next 2/2] netfilter: store rules per NUMA node instead of per cpu

Florian Westphal <fw@xxxxxxxxx> · Thu, 28 May 2015 23:52:23 +0200

Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On Thu, 2015-05-28 at 22:51 +0200, Florian Westphal wrote:
> > We store the rule blob per (possible) cpu.  Unfortunately this means we can
> > waste a lot of memory on big SMP machines. The ipt_entry structure ('rule head')
> > is 112 bytes, so e.g. with maxcpu=64 one single rule eats close to 8k RAM.
> > 
> > Since the previous patch moved the counters to a separate percpu blob, it
> > appears there is nothing left in the rule blob that must be percpu.
> > 
> > Thus only duplicate the rule blob for each NUMA node.
> > 
> > On my test system (144 possible cpus, one NUMA node, 400k dummy rules) this
> > change saves close to 9 gigabytes of RAM.
> > 
> > Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx>
> > Acked-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> > Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> > ---
> 
> Really, if the program is now read-only, I would keep a single copy in
> memory.

Some matches (limit, for instance) store a kernel data pointer in their
matchinfo data (set from the checkentry hook, not from the per-packet match
function), so it's not 100% read-only.

> Are we copying kernel text to each NUMA node ? ;)

Beats me.  I was under the impression that a cpu accessing memory on another
node pays an access penalty; that's why I changed it to per-node allocation.

Is it insignificant in practice?

If so, I can respin it without the NUMA duplication; we can still add it
back later if needed.
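
For reference, the memory figures from the changelog can be roughly
sanity-checked with the back-of-the-envelope sketch below. It is only an
estimate: the 112-byte ipt_entry size comes from the changelog, but the
40-byte standard target per dummy rule is an assumption, and the helper
names (blob_bytes, saved_bytes) are made up for illustration.

```python
# Rough estimate of rule blob memory when the blob is duplicated N times.
IPT_ENTRY_SIZE = 112        # bytes, 'rule head' size quoted in the changelog
ASSUMED_TARGET_SIZE = 40    # bytes, ASSUMPTION: one standard target per rule
ASSUMED_RULE_SIZE = IPT_ENTRY_SIZE + ASSUMED_TARGET_SIZE


def blob_bytes(rules, copies, rule_size=ASSUMED_RULE_SIZE):
    """Total bytes used when `rules` rules are stored in `copies` copies."""
    return rules * rule_size * copies


def saved_bytes(rules, possible_cpus, numa_nodes, rule_size=ASSUMED_RULE_SIZE):
    """Bytes saved by going from one copy per possible cpu to one per node."""
    per_cpu = blob_bytes(rules, possible_cpus, rule_size)
    per_node = blob_bytes(rules, numa_nodes, rule_size)
    return per_cpu - per_node


if __name__ == "__main__":
    # One rule head alone, duplicated for 64 possible cpus.
    print(blob_bytes(1, 64, IPT_ENTRY_SIZE), "bytes")
    # 400k dummy rules, 144 possible cpus, one NUMA node.
    print(f"{saved_bytes(400_000, 144, 1) / 1e9:.2f} GB saved")
```

With one NUMA node the per-node layout and a single shared copy cost the
same; the two only differ on multi-node machines, where a single copy
would use 1/nodes of the per-node memory.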