On Thu, Nov 18, 2010 at 11:43 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote: > Le jeudi 18 novembre 2010 à 22:39 +0800, Changli Gao a écrit : >> As only xt_counters are private to each CPU, we don't need to maintain >> a whole individual table for each CPU. >> >> In the kernel space, we use the memory of ipt_entry.counters to save a >> pointer to a percpu xt_counters. When iptables runs, it only update the >> counters on its own CPU. >> >> On non SMP platforms, no change is made. >> >> Only the code of iptables is converted. Thanks for reviews. >> > > Changli > > I answered you a (difficult) work was in progress, still you post a > patch that needs our review and time ? This is crazy. Sorry for interrupt, and thanks for review. I posted this patch to check if I was in the right direction. > > I am tempted to stop here. Oh well... > > Your way of allocating a percpu counter for each counter is a pure TLB > and cache line blower (up to two cache lines per counter), not counting > the time needed to load a new table with 10.000 entries. Some people > still use scripts with hundred of calls to iptables. > > percpu_alloc() is not meant to be used thousand of times per second. It > is not scalable. > > You consume 16 bytes per counter in the main table, while 4 bytes index > should be enough on SMP build. Most firewalls I know use two or four > cpus at most. I think we can't change the structure of ipt_entry, as it is exposed to userspace as an ABI. Though there is no need to keep the same structure in the kernel space, converting is a big work. :) > > They care about speed, not really because iptables duplicates table on > each cpu. By the way, not using NUMA can definitly hurt firewalls with > many rules, unless you make sure the main table is vmalloced() with node > distribution, not a single node. Even with this, this can hurt > latencies. This is what I worry about. I think only test can verify it. > > Allocating one contiguous percpu var for all counters is a must. > > Problem is : percpu alloc doesnt allow big allocations. > > #define PCPU_MIN_UNIT_SIZE PFN_ALIGN(32 << 10) > > So max allocation is 32 Kbytes, thats 2048 'xt_counters' only. > -> cannot really use pcpu-alloc, but a kmalloc_node() or vmalloc_node() > per cpu > Thanks. I'll try. -- Regards, Changli Gao(xiaosuo@xxxxxxxxx) -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html