Hi,

On 10/21/2022 10:09 AM, Alexei Starovoitov wrote:
> On Thu, Oct 20, 2022 at 7:06 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> On 10/21/2022 9:48 AM, Alexei Starovoitov wrote:
>>> On Fri, Oct 21, 2022 at 09:43:08AM +0800, Hou Tao wrote:
>>>> Hi,
>>>>
>>>> On 10/21/2022 2:01 AM, Hao Luo wrote:
>>>>> On Thu, Oct 20, 2022 at 6:57 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>>>>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>>>>>
>>>>>> Since commit fba1a1c6c912 ("bpf: Convert hash map to bpf_mem_alloc."),
>>>>>> numa node setting for non-preallocated hash table is ignored. The reason
>>>>>> is that bpf memory allocator only supports NUMA_NO_NODE, but it seems it
>>>>>> is trivial to support numa node setting for bpf memory allocator.
>>>>>>
>>>>>> So adding support for setting numa node in bpf memory allocator and
>>>>>> updating hash map accordingly.
>>>>>>
>>>>>> Fixes: fba1a1c6c912 ("bpf: Convert hash map to bpf_mem_alloc.")
>>>>>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
>>>>>> ---
>> SNIP
>>>> How about the following comments ?
>>>>
>>>> * For per-cpu allocator (percpu=true), the only valid value of numa_node is
>>>> * NUMA_NO_NODE. For non-per-cpu allocator, if numa_node is NUMA_NO_NODE, the
>>>> * preferred memory allocation node is the numa node where the allocating CPU
>>>> * is located, else the preferred node is the specified numa_node.
>>> No. This patch doesn't make sense to me.
>>> As far as I can see it can only make things worse.
>>> Why would you want a cpu to use non local memory?
>> For pre-allocated hash table, the numa node setting is honored. And I think the
>> reason is that there are bpf progs which are pinned on specific CPUs or numa
>> nodes and accessing local memory will be good for performance.
> prealloc happens at map creation time while
> bpf prog might be running on completely different cpu,
> so numa is necessary for prealloc.
I see.
So for a non-preallocated hash map, the memory will be allocated from the
current NUMA node if possible, and there will be no memory affinity problems
as long as these programs run on the same NUMA node.
>
>> And in my
>> understanding, the bpf memory allocator is trying to replace pre-allocated hash
>> table to save memory, if the numa node setting is ignored, the above use cases
>> may be work badly. Also I am trying to test whether or not there is visible
>> performance improvement for the above assumed use case.
> numa should be ignored, because we don't want users to accidently
> pick wrong numa id.
How about rejecting the NUMA node setting for non-preallocated hash tables in
hashtab.c?
>
>>> The commit log:
>>> " is that bpf memory allocator only supports NUMA_NO_NODE, but it seems it
>>> is trivial to support numa node setting for bpf memory allocator."
>>> got it wrong.
>>>
>>> See the existing comment:
>>> /* irq_work runs on this cpu and kmalloc will allocate
>>> * from the current numa node which is what we want here.
>>> */
>>> alloc_bulk(c, c->batch, NUMA_NO_NODE);
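
To make the rejection idea for non-preallocated hash tables concrete, here is a
rough userspace sketch. It is not the hashtab.c code: every sketch_* name is
made up for illustration, the two flag values are copied from
include/uapi/linux/bpf.h so the snippet builds standalone, and returning
-EINVAL is only an assumption about which error would be chosen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from include/uapi/linux/bpf.h; names are hypothetical. */
#define SKETCH_F_NO_PREALLOC (1U << 0)  /* BPF_F_NO_PREALLOC */
#define SKETCH_F_NUMA_NODE   (1U << 2)  /* BPF_F_NUMA_NODE */
#define SKETCH_NUMA_NO_NODE  (-1)       /* NUMA_NO_NODE */

/* Hypothetical stand-in for the two union bpf_attr fields that matter here. */
struct sketch_map_attr {
	uint32_t map_flags;
	uint32_t numa_node;
};

/* The requested node only counts when the NUMA flag is set. */
static int sketch_attr_numa_node(const struct sketch_map_attr *attr)
{
	if (attr->map_flags & SKETCH_F_NUMA_NODE)
		return (int)attr->numa_node;
	return SKETCH_NUMA_NO_NODE;
}

/*
 * Reject an explicit NUMA node for a non-preallocated map, since its
 * elements are allocated later on whichever CPU runs the update.
 * -EINVAL is an assumed error code, not taken from the thread.
 */
static int sketch_htab_check_numa(const struct sketch_map_attr *attr)
{
	bool prealloc = !(attr->map_flags & SKETCH_F_NO_PREALLOC);
	int numa_node = sketch_attr_numa_node(attr);

	if (!prealloc && numa_node != SKETCH_NUMA_NO_NODE)
		return -EINVAL;
	return 0;
}

/* Small self-check covering the cases discussed in this thread. */
static void sketch_self_check(void)
{
	struct sketch_map_attr prealloc_numa = {
		.map_flags = SKETCH_F_NUMA_NODE, .numa_node = 1,
	};
	struct sketch_map_attr no_prealloc_numa = {
		.map_flags = SKETCH_F_NO_PREALLOC | SKETCH_F_NUMA_NODE,
		.numa_node = 1,
	};
	struct sketch_map_attr no_prealloc_plain = {
		.map_flags = SKETCH_F_NO_PREALLOC, .numa_node = 0,
	};

	assert(sketch_htab_check_numa(&prealloc_numa) == 0);
	assert(sketch_htab_check_numa(&no_prealloc_numa) == -EINVAL);
	assert(sketch_htab_check_numa(&no_prealloc_plain) == 0);
}
```

With a check like this, a preallocated map keeps honoring the requested node,
while a non-preallocated map can only be created without BPF_F_NUMA_NODE, so
its elements always come from the node of the allocating CPU.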