On 7/11/24 8:04 PM, Christoph Lameter (Ampere) wrote:
> On Thu, 11 Jul 2024, Vlastimil Babka wrote:
>
>>> There are also the cpuset/cgroup restrictions via the zonelists that are
>>> bypassed by removing alloc_pages()
>>
>> AFAICS cpusets are handled on a level that's reached by both paths, i.e.
>> prepare_alloc_pages(), and I see nothing that would make switching to
>> alloc_pages_node() bypass it. Am I missing something?
>
> You are correct. cpuset/cgroup restrictions also apply to
> alloc_pages_node().
>
>>> We have some internal patches now that implement memory policies on a per
>>> object basis for SLUB here.
>>>
>>> This is a 10-15% regression on various benchmarks when objects like the
>>> scheduler statistics structures are misplaced.
>>
>> I believe it would be best if you submitted a patch with all that
>> reasoning. Thanks!

I still believe that :)

> Turns out those performance issues are related to the fact that NUMA
> locality is only considered at the folio level for slab allocation.
> Individual object allocations are not subject to it.
>
> The performance issue comes about in the following way:
>
> Two kernel threads run on the same cpu using the same slab cache. One of
> them keeps on allocating from a different node via kmalloc_node() and the
> other is using kmalloc(). Then the kmalloc_node() thread will always
> ensure that the per cpu slab is from the other node.
>
> The other thread will use kmalloc(), which does not check which node the
> per cpu slab is from. Therefore the kmalloc() thread can continually be
> served objects that are not local. That is not good and causes
> misplacement of objects.
>
> But that issue is something separate from this commit here, and we see the
> same regression before this commit.
>
> This patch still needs to be reverted since the rationale for the patch
> is not right and it disables memory policy support. This results in the
> strange situation that memory policies are used in get_any_partial() in
> slub but not during allocation anymore.
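
For illustration only, here is a minimal sketch of the interference pattern
described above, as I understand it; my_cache, remote_node, thread_a and
thread_b are made-up names for the example, not code from the thread:

#include <linux/slab.h>
#include <linux/kthread.h>
#include <linux/sched.h>

/* Hypothetical shared cache and a node other than the running CPU's node. */
static struct kmem_cache *my_cache;
static int remote_node = 1;

/* Thread A: pinned to the same CPU as thread B, allocates from remote_node. */
static int thread_a(void *unused)
{
	while (!kthread_should_stop()) {
		void *obj = kmem_cache_alloc_node(my_cache, GFP_KERNEL,
						  remote_node);
		/*
		 * Refilling the per cpu slab to satisfy the node request
		 * leaves that slab pointing at a remote folio.
		 */
		if (obj)
			kmem_cache_free(my_cache, obj);
		cond_resched();
	}
	return 0;
}

/* Thread B: same CPU, allocates without specifying a node. */
static int thread_b(void *unused)
{
	while (!kthread_should_stop()) {
		/*
		 * A plain allocation is served from whatever per cpu slab is
		 * current, without checking which node it came from, so the
		 * object returned here can repeatedly be remote.
		 */
		void *obj = kmem_cache_alloc(my_cache, GFP_KERNEL);

		if (obj)
			kmem_cache_free(my_cache, obj);
		cond_resched();
	}
	return 0;
}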