On Mon, Feb 24, 2025 at 01:31:31AM +0000, Wei Yang wrote:
> On Wed, Feb 19, 2025 at 09:24:31AM +0200, Mike Rapoport wrote:
> >Hi,
> >
> >On Tue, Feb 18, 2025 at 03:50:04PM +0000, Wei Yang wrote:
> >> On Thu, Feb 06, 2025 at 03:27:42PM +0200, Mike Rapoport wrote:
> >> >From: "Mike Rapoport (Microsoft)" <rppt@xxxxxxxxxx>
> >> >
> >> >to denote areas that were reserved for kernel use either directly with
> >> >memblock_reserve_kern() or via memblock allocations.
> >> >
> >> >Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> >> >---
> >> > include/linux/memblock.h | 16 +++++++++++++++-
> >> > mm/memblock.c            | 32 ++++++++++++++++++++++++--------
> >> > 2 files changed, 39 insertions(+), 9 deletions(-)
> >> >
> >> >diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> >> >index e79eb6ac516f..65e274550f5d 100644
> >> >--- a/include/linux/memblock.h
> >> >+++ b/include/linux/memblock.h
> >> >@@ -50,6 +50,7 @@ enum memblock_flags {
> >> >         MEMBLOCK_NOMAP          = 0x4,  /* don't add to kernel direct mapping */
> >> >         MEMBLOCK_DRIVER_MANAGED = 0x8,  /* always detected via a driver */
> >> >         MEMBLOCK_RSRV_NOINIT    = 0x10, /* don't initialize struct pages */
> >> >+        MEMBLOCK_RSRV_KERN      = 0x20, /* memory reserved for kernel use */
> >>
> >> Above memblock_flags, there are comments on explaining those flags.
> >>
> >> Seems we miss it for MEMBLOCK_RSRV_KERN.
> >
> >Right, thanks!
> >
> >> >
> >> > #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
> >> >@@ -1459,14 +1460,14 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
> >> > again:
> >> >         found = memblock_find_in_range_node(size, align, start, end, nid,
> >> >                                             flags);
> >> >-        if (found && !memblock_reserve(found, size))
> >> >+        if (found && !__memblock_reserve(found, size, nid, MEMBLOCK_RSRV_KERN))
> >>
> >> Maybe we could use memblock_reserve_kern() directly. If my understanding is
> >> correct, the reserved region's nid is not used.
> >
> >We use nid of reserved regions in reserve_bootmem_region() (commit
> >61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")) but KHO needs to
> >know the distribution of reserved memory among the nodes before
> >memmap_init_reserved_pages().
> >
> >> BTW, one question here. How do we handle concurrent memblock allocation? If two
> >> threads find the same available range and do the reservation, it seems to be a
> >> problem to me. Or I missed something?
> >
> >memblock allocations end before smp_init(), there is no possible concurrency.
> >
>
> Thanks, I still have one question here.
>
> Below is a simplified call flow.
>
>     mm_core_init()
>         mem_init()
>             memblock_free_all()
>                 free_low_memory_core_early()
>                     memmap_init_reserved_pages()
>                         memblock_set_node(..., memblock.reserved, )   --- (1)
>                     __free_memory_core()
>         kmem_cache_init()
>             slab_state = UP;                                          --- (2)
>
> And memblock_alloc_range_nid() is not supposed to be called after
> slab_is_available(). Even if someone does call it, it will get memory from slab
> instead of a reserved region in memblock.
>
> From the above call flow and background, there are three cases when
> memblock_alloc_range_nid() would be called:
>
>   * If it is called before (1), memblock.reserved's nid would be adjusted correctly.
>   * If it is called after (2), we don't touch memblock.reserved.
>   * If it happens between (1) and (2), it looks like it would break the
>     consistency of nid information in memblock.reserved, because when we use
>     memblock_reserve_kern(), NUMA_NO_NODE would be stored in the region.
>
> So my question is if the third case happens, would it introduce a bug? If it
> won't happen, seems we don't need to specify the nid here?

On the one hand, we don't really care about proper assignment of nodes
between (1) and (2); on the other hand, the third case does not happen.
Nothing should call memblock_alloc() after memblock_free_all().

But it's easy to make the window between (1) and (2) disappear by replacing
checks for slab_is_available() in memblock with a variable local to
memblock.

> --
> Wei Yang
> Help you, Help me

--
Sincerely yours,
Mike.
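
To illustrate the last point of the reply, here is a minimal userspace sketch
in C, not kernel code. It models the two checkpoints from the call flow and
compares a check keyed on slab availability with a check keyed on a flag local
to memblock. All sketch_* names are hypothetical, and the choice to set the
local flag when memblock_free_all() starts is an assumption; the thread does
not say where such a variable would be set or what an allocation attempt
should do once it is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not kernel symbols. */
static bool sketch_memblock_done; /* assumed memblock-local flag, set at (1) */
static bool sketch_slab_up;       /* models slab_is_available(), set at (2) */

/* Models memblock_free_all(): assumed to close memblock for new reservations. */
static void sketch_memblock_free_all(void)
{
	sketch_memblock_done = true;
}

/* Models kmem_cache_init() reaching slab_state = UP. */
static void sketch_kmem_cache_init(void)
{
	/* In the quoted call flow, (1) always precedes (2). */
	assert(sketch_memblock_done);
	sketch_slab_up = true;
}

/* Check keyed on slab state: memblock.reserved may still change until (2). */
static bool sketch_reserves_with_slab_check(void)
{
	return !sketch_slab_up;
}

/* Check keyed on the local flag: memblock.reserved is frozen from (1) on. */
static bool sketch_reserves_with_local_flag(void)
{
	return !sketch_memblock_done;
}
```

In this model, the only state where the two predicates disagree is after
sketch_memblock_free_all() and before sketch_kmem_cache_init(), which is the
window between (1) and (2) discussed above.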