On Fri, Aug 02, 2024 at 05:07:48PM -0700, Nathan Chancellor wrote:
>Hi Wei,
>
>On Fri, Jul 26, 2024 at 12:36:12AM +0000, Wei Yang wrote:
>> Total memory represents pages managed by buddy system. After the
>> introduction of DEFERRED_STRUCT_PAGE_INIT, it may count the pages before
>> being managed.
>>
>> free_low_memory_core_early() returns number of pages for all free pages,
>> even at this moment only early initialized pages are freed to buddy
>> system. This means the total memory at this moment is not correct.
>>
>> Let's increase it when pages are freed to buddy system.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
>> CC: David Hildenbrand <david@xxxxxxxxxx>
>>
>> ---
>> v2:
>>   * rebase on current master
>>   * those places would be affected are merged
>> ---
>>  mm/memblock.c   | 22 ++++++----------------
>>  mm/page_alloc.c |  4 +---
>>  2 files changed, 7 insertions(+), 19 deletions(-)
>>
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index 213057603b65..592a22b64682 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1711,10 +1711,8 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size)
>>  	cursor = PFN_UP(base);
>>  	end = PFN_DOWN(base + size);
>>
>> -	for (; cursor < end; cursor++) {
>> +	for (; cursor < end; cursor++)
>>  		memblock_free_pages(pfn_to_page(cursor), cursor, 0);
>> -		totalram_pages_inc();
>> -	}
>>  }
>>
>>  /*
>> @@ -2140,7 +2138,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
>>  	}
>>  }
>>
>> -static unsigned long __init __free_memory_core(phys_addr_t start,
>> +static void __init __free_memory_core(phys_addr_t start,
>>  				 phys_addr_t end)
>>  {
>>  	unsigned long start_pfn = PFN_UP(start);
>> @@ -2148,11 +2146,9 @@ static unsigned long __init __free_memory_core(phys_addr_t start,
>>  				      PFN_DOWN(end), max_low_pfn);
>>
>>  	if (start_pfn >= end_pfn)
>> -		return 0;
>> +		return;
>>
>>  	__free_pages_memory(start_pfn, end_pfn);
>> -
>> -	return end_pfn - start_pfn;
>>  }
>>
>>  static void __init memmap_init_reserved_pages(void)
>> @@ -2194,9 +2190,8 @@ static void __init memmap_init_reserved_pages(void)
>>  	}
>>  }
>>
>> -static unsigned long __init free_low_memory_core_early(void)
>> +static void __init free_low_memory_core_early(void)
>>  {
>> -	unsigned long count = 0;
>>  	phys_addr_t start, end;
>>  	u64 i;
>>
>> @@ -2211,9 +2206,7 @@ static unsigned long __init free_low_memory_core_early(void)
>>  	 */
>>  	for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end,
>>  				NULL)
>> -		count += __free_memory_core(start, end);
>> -
>> -	return count;
>> +		__free_memory_core(start, end);
>>  }
>>
>>  static int reset_managed_pages_done __initdata;
>> @@ -2244,13 +2237,10 @@ void __init reset_all_zones_managed_pages(void)
>>   */
>>  void __init memblock_free_all(void)
>>  {
>> -	unsigned long pages;
>> -
>>  	free_unused_memmap();
>>  	reset_all_zones_managed_pages();
>>
>> -	pages = free_low_memory_core_early();
>> -	totalram_pages_add(pages);
>> +	free_low_memory_core_early();
>>  }
>>
>>  /* Keep a table to reserve named memory */
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 71d2716a554f..4701bc442df6 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1248,16 +1248,14 @@ void __meminit __free_pages_core(struct page *page, unsigned int order,
>>  		 * map it first.
>>  		 */
>>  		debug_pagealloc_map_pages(page, nr_pages);
>> -		adjust_managed_page_count(page, nr_pages);
>>  	} else {
>>  		for (loop = 0; loop < nr_pages; loop++, p++) {
>>  			__ClearPageReserved(p);
>>  			set_page_count(p, 0);
>>  		}
>>
>> -		/* memblock adjusts totalram_pages() manually. */
>> -		atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
>>  	}
>> +	adjust_managed_page_count(page, nr_pages);
>>
>>  	if (page_contains_unaccepted(page, order)) {
>>  		if (order == MAX_PAGE_ORDER && __free_unaccepted(page))
>> --
>> 2.34.1
>>
>>
>
>After this change as commit 0e690b558b53 ("mm: increase totalram_pages
>on freeing to buddy system") in -next, I see an issue when booting
>OpenSUSE's powerpc64le configuration in QEMU (I have not tried to see if
>there is a specific configuration option triggers this yet but it does
>not happen with all of my powerpc configurations):
>
>$ curl -LSso .config https://github.com/openSUSE/kernel-source/raw/master/config/ppc64le/default
>
>$ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc64-linux- olddefconfig zImage.epapr
>
>$ qemu-system-ppc64 \
>    -display none \
>    -nodefaults \
>    -device ipmi-bmc-sim,id=bmc0 \
>    -device isa-ipmi-bt,bmc=bmc0,irq=10 \
>    -machine powernv \
>    -kernel arch/powerpc/boot/zImage.epapr \
>    -initrd rootfs.cpio \
>    -m 2G \
>    -serial mon:stdio

Hi, Nathan

Thanks for testing.

After some debugging, the broken point is in mm/shmem.c. The functions
shmem_default_max_blocks() / shmem_default_max_inodes() are called by
shmem_fill_super() during an early stage. But I can't get the total free
pages from memblock here, as those functions will be called later, when
memblock is discarded.

I may need to find another way to handle it.

--
Wei Yang
Help you, Help me
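
P.S. To illustrate the ordering problem described above, here is a minimal
user-space sketch, not kernel code. All names in it (struct ram_model,
model_free_to_buddy(), model_default_max_blocks()) are hypothetical, and it
assumes the default tmpfs block limit is derived as half of the total-RAM
counter at the time of the call. With the counter increased only as pages
reach the buddy system, a caller that runs before deferred pages are freed
sees a smaller limit than one that runs afterwards.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel symbols. */
struct ram_model {
	unsigned long present_pages;	/* pages that will eventually be freed */
	unsigned long totalram;		/* stand-in for the total-RAM counter */
};

/*
 * Model of handing nr pages to the buddy allocator: the counter grows
 * only when pages are actually freed, as the patch intends.
 */
static void model_free_to_buddy(struct ram_model *m, unsigned long nr)
{
	/* The counter must never exceed the pages that exist. */
	assert(m->totalram + nr <= m->present_pages);
	m->totalram += nr;
}

/*
 * Assumed default limit: half of the pages counted so far. A caller that
 * runs early therefore gets a limit based on a partial count.
 */
static unsigned long model_default_max_blocks(const struct ram_model *m)
{
	return m->totalram / 2;
}
```

For example, with 1000 present pages of which only 200 have been freed early,
the modelled limit is 100 blocks; once the remaining 800 pages are freed, the
same call yields 500.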