On Wed, Jul 17, 2024 at 04:42:48PM +0200, David Hildenbrand wrote:
> On 16.07.24 13:13, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
> > 
> > Architectures that support NUMA duplicate the code that allocates
> > NODE_DATA on the node-local memory with slight variations in reporting
> > of the addresses where the memory was allocated.
> > 
> > Use x86 version as the basis for the generic alloc_node_data() function
> > and call this function in architecture specific numa initialization.
> > 
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> > ---
> 
> [...]
> 
> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > index 9208eaadf690..909f6cec3a26 100644
> > --- a/arch/mips/loongson64/numa.c
> > +++ b/arch/mips/loongson64/numa.c
> > @@ -81,12 +81,8 @@ static void __init init_topology_matrix(void)
> >  static void __init node_mem_init(unsigned int node)
> >  {
> > -	struct pglist_data *nd;
> >  	unsigned long node_addrspace_offset;
> >  	unsigned long start_pfn, end_pfn;
> > -	unsigned long nd_pa;
> > -	int tnid;
> > -	const size_t nd_size = roundup(sizeof(pg_data_t), SMP_CACHE_BYTES);
> One interesting change is that we now always round up to full pages on
> architectures where we previously rounded up to SMP_CACHE_BYTES.

On my workstation struct pglist_data takes 174400 bytes (2725 cachelines,
43 members).

> I assume we don't really expect a significant growth in memory consumption
> that we care about, especially because most systems with many nodes also
> have quite some memory around.

With the Debian kernel configuration for 6.5, struct pglist_data takes
174400 bytes, so the increase here is below 1%. For NUMA systems with a
lot of nodes that shouldn't be a problem.

> > -/* Allocate NODE_DATA for a node on the local memory */
> > -static void __init alloc_node_data(int nid)
> > -{
> > -	const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
> > -	u64 nd_pa;
> > -	void *nd;
> > -	int tnid;
> > -
> > -	/*
> > -	 * Allocate node data.  Try node-local memory and then any node.
> > -	 * Never allocate in DMA zone.
> > -	 */
> > -	nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
> > -	if (!nd_pa) {
> > -		pr_err("Cannot find %zu bytes in any node (initial node: %d)\n",
> > -		       nd_size, nid);
> > -		return;
> > -	}
> > -	nd = __va(nd_pa);
> > -
> > -	/* report and initialize */
> > -	printk(KERN_INFO "NODE_DATA(%d) allocated [mem %#010Lx-%#010Lx]\n", nid,
> > -	       nd_pa, nd_pa + nd_size - 1);
> > -	tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
> > -	if (tnid != nid)
> > -		printk(KERN_INFO "    NODE_DATA(%d) on node %d\n", nid, tnid);
> > -
> > -	node_data[nid] = nd;
> > -	memset(NODE_DATA(nid), 0, sizeof(pg_data_t));
> > -
> > -	node_set_online(nid);
> > -}
> > -
> >  /**
> >   * numa_cleanup_meminfo - Cleanup a numa_meminfo
> >   * @mi: numa_meminfo to clean up
> > @@ -571,6 +538,7 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
> >  			continue;
> >  		alloc_node_data(nid);
> > +		node_set_online(nid);
> >  	}
> I can spot that we only remove a single node_set_online() call from x86.
> 
> What about all the other architectures? Will there be any change in behavior
> for them? Or do we simply set the nodes online later once more?

On x86, node_set_online() was part of alloc_node_data() and I moved it
outside so it is called right after alloc_node_data(). On other
architectures the allocation didn't include that call, so there should be
no difference there.

> -- 
> Cheers,
> 
> David / dhildenb
> 
> 

-- 
Sincerely yours,
Mike.
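
As a quick sanity check of the "below 1%" estimate above (assuming 64-byte
cache lines and 4 KiB pages, which are not stated in the thread): 174400
bytes is already an exact multiple of SMP_CACHE_BYTES (174400 / 64 = 2725
cache lines), so the old rounding adds nothing, while roundup(174400, 4096)
is 43 pages = 176128 bytes, i.e. 1728 extra bytes per node, or roughly
0.99% per NODE_DATA allocation.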
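
The quoted hunks only show the removals, so here is a sketch of what the
generic alloc_node_data() being discussed could look like, reconstructed
from the removed x86 version above; the actual upstream implementation may
differ in details such as error handling. Note the roundup to PAGE_SIZE and
that node_set_online() is no longer called here but by the caller, as in
the numa_register_memblks() hunk:

	#include <linux/memblock.h>
	#include <linux/mm.h>
	#include <linux/printk.h>
	#include <linux/string.h>

	/* Allocate NODE_DATA for a node on the local memory (sketch) */
	static void __init alloc_node_data(int nid)
	{
		/* Always round up to a full page, not to SMP_CACHE_BYTES */
		const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
		u64 nd_pa;
		void *nd;
		int tnid;

		/* Try node-local memory first, then fall back to any node */
		nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
		if (!nd_pa) {
			pr_err("Cannot find %zu bytes in any node (initial node: %d)\n",
			       nd_size, nid);
			return;
		}
		nd = __va(nd_pa);

		/* Report the allocation and the node it actually landed on */
		pr_info("NODE_DATA(%d) allocated [mem %#010Lx-%#010Lx]\n", nid,
			nd_pa, nd_pa + nd_size - 1);
		tnid = early_pfn_to_nid(nd_pa >> PAGE_SHIFT);
		if (tnid != nid)
			pr_info("    NODE_DATA(%d) on node %d\n", nid, tnid);

		node_data[nid] = nd;
		memset(NODE_DATA(nid), 0, sizeof(pg_data_t));

		/* node_set_online(nid) is intentionally left to the caller */
	}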