On Fri, Jan 19, 2024 at 02:46:16PM +0800, Shijie Huang wrote: > > 在 2024/1/19 12:42, Yury Norov 写道: > > This adds another level of indirection, I think. Currently cpu_to_node > > is a simple inliner. After the patch it would be a real function with > > all the associate overhead. Can you share a bloat-o-meter output here? > #./scripts/bloat-o-meter vmlinux vmlinux.new > add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580) > Function old new delta > numa_update_cpu 148 244 +96 > > ...................................................................................................................................(to many to skip) > > Total: Before=32990130, After=32990710, chg +0.00% It's not only about text size, the indirect call also hurts performance > > > > Regardless, I don't think that the approach is correct. As per your > > description, some initialization functions erroneously call > > cpu_to_node() instead of early_cpu_to_node() which exists specifically > > for that case. > > > > If the above correct, it's clearly a caller problem, and the fix is to > > simply switch all those callers to use early version. > > It is easy to change to early_cpu_to_node() for sched_init(), > init_sched_fair_class() > > and workqueue_init_early(). These three places call the cpu_to_node() in the > __init function. > > > But it is a little hard to change the early_trace_init(), since it calls > cpu_to_node in the deep > > function stack: > > early_trace_init() --> ring_buffer_alloc() -->rb_allocate_cpu_buffer() > > > For early_trace_init(), we need to change more code. > > > Anyway, If we think it is not a good idea to change the common code, I am > oaky too. Is there a fundamental reason to have early_cpu_to_node() at all? It seems that all the mappings are known by the end of setup_arch() and the initialization of numa_node can be moved earlier. > > I would also initialize the numa_node with NUMA_NO_NODE at declaration, > > so that if someone calls cpu_to_node() before the variable is properly > > initialized at runtime, he'll get NO_NODE, which is obviously an error. > > Even we set the numa_node with NUMA_NO_NODE, it does not always produce > error. > > Please see the alloc_pages_node(). > > > Thanks > > Huang Shijie > -- Sincerely yours, Mike.