Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: > > The only reason left as to why we need percpu duplication are the rule > > counters embedded into ipt_entry et al -- since each cpu has its own copy > > of the rules, all counters can be lockless. > > > > The downside is that the more cpus are supported, the more memory is > > required. Rules are not just duplicated per online cpu but for each > > possible cpu, i.e. if maxcpu is 144, then rule is duplicated 144 times, > > not for the e.g. 64 cores present. > > > > To save some memory and also improve utilization of shared caches it > > would be preferable to only store the rule blob once. > > > > So we first need to separate counters and the rule blob. > > > > Instead of using entry->counters, allocate this percpu and store the > > percpu address in entry->counters.pcnt on CONFIG_SMP. > > > > This change makes no sense as-is; it is merely an intermediate step to > > remove the percpu duplication of the rule set in a followup patch. > > > > Suggested-by: Eric Dumazet <edumazet@xxxxxxxxxx> > > Acked-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx> > > Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@xxxxxxxxx> > > Signed-off-by: Florian Westphal <fw@xxxxxxxxx> > > --- [..] > > +++ b/include/linux/netfilter/x_tables.h > > +/* On SMP, ip(6)t_entry->counters.pcnt holds address of the > > + * real (percpu) counter. On !SMP, its just the packet count, > > + * so nothing needs to be done there. > > + * > > + * xt_percpu_counter_alloc returns the address of the percpu > > + * counter, or 0 on !SMP. > > + * > > + * Hence caller must use IS_ERR_VALUE to check for error, this > > + * allows us to contain CONFIG_SMP kludgery. > > + */ > > +static inline u64 xt_percpu_counter_alloc(void) > > +{ > > +#ifdef CONFIG_SMP > > + void *res = alloc_percpu(struct xt_counters); > > + > > + if (res == NULL) > > + return (u64) -ENOMEM; > > + > > + return (__force u64) res; > > +#else > > + return 0; > > +#endif > > +} > > > The u64 stuff looks strange on a 32bit kernel ;) Hmmm, not sure how to avoid that. The packet and byte counters are always u64, and I don't think it will help to provide different helpers for 32 vs. 64bit system. > > +static inline void xt_percpu_counter_free(u64 pcnt) > > +{ > > +#ifdef CONFIG_SMP > > + free_percpu((__force void *) pcnt); > > +#endif > > +} > > + > > +static inline struct xt_counters * > > +xt_get_this_cpu_counter(struct xt_counters *cnt) > > +{ > > +#ifdef CONFIG_SMP > > + return this_cpu_ptr((void *) cnt->pcnt); > > +#else > > + return cnt; > > +#endif > > +} > > + > > +static inline struct xt_counters * > > +xt_get_per_cpu_counter(struct xt_counters *cnt, unsigned int cpu) > > +{ > > +#ifdef CONFIG_SMP > > + return per_cpu_ptr((void *) cnt->pcnt, cpu); > > +#else > > + return cnt; > > +#endif > > +} > > + > > So I understand struct xt_counters is exported to user space > (include/uapi/linux/netfilter/x_tables.h) > > But maybe you could use in the kernel > > union xt_kcounters { > struct xt_counters __percpu *pcpu; > struct xt_counters counters; > }; > > This way, you could avoid a lot of casts ? I tried that but I did not find out how to use this to improve readability :-| static inline bool xt_percpu_counter_alloc(union xt_kcounters *cnt) { #ifdef CONFIG_SMP cnt->pcpu = alloc_percpu(struct xt_counters); return cnt->pcpu != NULL; #else return true; #endif } ... looks ok BUT it means call site looks like if (xt_percpu_counter_alloc((union xt_kcounters *) &e->counters)) /* err */ which I find worse :-/ I could add anon union in ipt_entry to avoid casts but I remember there were build failure problems with anon unions & older gcc releases. Is there any idea you have wrt. not leaking percpu yes|no details to ip(6)_tables.c while still avoiding the casts outside the static inline helpers above? > Please run sparse to make sure you did not miss an annotation. > > include/linux/netfilter/x_tables.h:370:21: warning: incorrect type in initializer (different address spaces) > include/linux/netfilter/x_tables.h:370:21: expected void *res > include/linux/netfilter/x_tables.h:370:21: got struct xt_counters [noderef] <asn:3>*<noident> Ugh. I sprinkled a few more __percpu annotations on x_tables.h and these are gone now. Thanks Eric. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html