On Thu, Jan 18, 2024 at 11:14:12AM +0800, Huang Shijie wrote: > (0) We list the ARCHs which support the NUMA: > arm64, loongarch, powerpc, riscv, > sparc, mips, s390, x86, I do not understand this format, what are you saying here? Have you read the kernel documentation for how to write changelog texts? It doesn't say "list a bunch of things", it's a bit more descriptive. > > (1) Some ARCHs in (0) override the generic cpu_to_node(), such as: > sparc, mips, s390, x86. > > Since these ARCHs have their own cpu_to_node(), we do not care > about them. > > (2) The ARCHs enable NUMA and use the generic cpu_to_node. > From (0) and (1), we can know that four ARCHs support NUMA and > use the generic cpu_to_node: > arm64, loongarch, powerpc, riscv, > > The generic cpu_to_node depends on percpu "numa_node". > > (2.1) The loongarch sets "numa_node" in: > start_kernel --> smp_prepare_boot_cpu() > > (2.2) The arm64, powerpc, riscv set "numa_node" in: > start_kernel --> arch_call_rest_init() --> rest_init() > --> kernel_init() --> kernel_init_freeable() > --> smp_prepare_cpus() > > (2.3) The first place calling the cpu_to_node() is early_trace_init(): > start_kernel --> early_trace_init()--> __ring_buffer_alloc() > --> rb_allocate_cpu_buffer() > > (2.4) So it safe for loongarch. But for arm64, powerpc and riscv, > there are at least four places in the common code where > the cpu_to_node() is called before it is initialized: > a.) early_trace_init() in kernel/trace/trace.c > b.) sched_init() in kernel/sched/core.c > c.) init_sched_fair_class() in kernel/sched/fair.c > d.) workqueue_init_early() in kernel/workqueue.c > > (3) In order to fix the issue, the patch refactors the generic cpu_to_node: > (3.1) change cpu_to_node to function pointer, > and export it for kernel modules. > > (3.2) introduce _cpu_to_node() which is the original cpu_to_node(). > > (3.3) introduce smp_prepare_boot_cpu_start() to wrap the original > smp_prepare_boot_cpu(), and set cpu_to_node with > early_cpu_to_node which works fine for arm64, powerpc, > riscv and loongarch. > > (3.4) introduce smp_prepare_cpus_done() to wrap the original > smp_prepare_cpus(). > The "numa_node" is ready after smp_prepare_cpus(), > then set cpu_to_node with _cpu_to_node(). When you start listing different things in a changelog, that's a hint to the reviewer to say "please break this up" as patches need to do only one thing at a time. As I can't follow the above text at all, that's all the review comments I'm able to give here, sorry. But as-is, this isn't acceptable :( thanks, greg k-h