On 03/28/24 at 11:53am, Mike Rapoport wrote:
> On Thu, Mar 28, 2024 at 04:32:38PM +0800, Baoquan He wrote:
> > On 03/25/24 at 10:56pm, Baoquan He wrote:
> > > 
> > >  	/*
> > > -	 * Set an approximate value for lowmem here, it will be adjusted
> > > -	 * when the bootmem allocator frees pages into the buddy system.
> > > -	 * And all highmem pages will be managed by the buddy system.
> > > +	 * Initialize zone->managed_pages as 0 , it will be reset
> > > +	 * when memblock allocator frees pages into buddy system.
> > >  	 */
> > > -	zone_init_internals(zone, j, nid, freesize);
> > > +	zone_init_internals(zone, j, nid, 0);
> > 
> > Here, we should initialize zone->managed_pages as zone->present_pages
> > because later page_group_by_mobility_disabled need be set according to
> > zone->managed_pages. Otherwise page_group_by_mobility_disabled will be
> > set to 1 always. I will sent out v3.
> 
> With zone->managed_pages set to zone->present_pages we won't account for
> the reserved memory for initialization of page_group_by_mobility_disabled.

The old zone->managed_pages didn't account for the reserved pages either.
It was calculated as (zone->present_pages - memmap_pages), and memmap
pages are only a very small portion of system memory. E.g. on x86_64 with
4K page size, assuming the size of struct page is 64 bytes, they are 1/64
of system memory; on arm64 with 64K page size, they are 1/1024 of system
memory.

As for the setting of page_group_by_mobility_disabled, the compared value
pageblock_nr_pages * MIGRATE_TYPES is very small. On x86_64 it's
4M*6=24M; on arm64 with 64K page size it's 128M*6=768M, which should be
the biggest among the ARCHes.

	if (vm_total_pages < (pageblock_nr_pages * MIGRATE_TYPES))
		page_group_by_mobility_disabled = 1;
	else
		page_group_by_mobility_disabled = 0;

So page_group_by_mobility_disabled could be set to 1 only on a system
with very little memory, which is very rarely seen. And setting
zone->managed_pages to zone->present_pages is very close to its old
value of (zone->present_pages - memmap_pages). We don't need to be very
accurate here, just a rough value.

> 
> As watermarks are still not initialized at the time build_all_zonelists()
> is called, we may use nr_all_pages - nr_kernel_pages instead of
> nr_free_zone_pages(), IMO.

nr_all_pages should be fine if we take this way. nr_kernel_pages is a
misleading name; it's all low memory pages excluding kernel reserved
pages, while nr_all_pages is all memory pages including highmem and
excluding kernel reserved pages.

Both ways are fine to me. The first one is easier: simply set
zone->managed_pages to zone->present_pages. The 2nd way is a little more
accurate.
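If we take the 2nd way, I guess the change would look like the below
(untested sketch; it assumes nr_all_pages, currently static __initdata
in mm/mm_init.c, gets made visible to mm/page_alloc.c):

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ ... @@ void __ref build_all_zonelists(pg_data_t *pgdat)
-	/* Get the number of free pages beyond high watermark in all zones. */
-	vm_total_pages = nr_free_zone_pages(gfp_zone(GFP_HIGHUSER_MOVABLE));
+	/*
+	 * Watermarks are not initialized yet at this point of boot, so
+	 * take the rough total from the early boot accounting instead.
+	 */
+	vm_total_pages = nr_all_pages;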
> 
> > From a17b0921b4bd00596330f61ee9ea4b82386a9fed Mon Sep 17 00:00:00 2001
> > From: Baoquan He <bhe@xxxxxxxxxx>
> > Date: Thu, 28 Mar 2024 16:20:15 +0800
> > Subject: [PATCH] mm/mm_init.c: set zone's ->managed_pages as ->present_pages
> >  for now
> > Content-type: text/plain
> > 
> > Because page_group_by_mobility_disabled need be set according to zone's
> > managed_pages later.
> > 
> > Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> > ---
> >  mm/mm_init.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index cc24e7958c0c..dd875f943cbb 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -1561,7 +1561,7 @@ static void __init free_area_init_core(struct pglist_data *pgdat)
> >  	 * Initialize zone->managed_pages as 0 , it will be reset
> >  	 * when memblock allocator frees pages into buddy system.
> >  	 */
> > -	zone_init_internals(zone, j, nid, 0);
> > +	zone_init_internals(zone, j, nid, zone->present_pages);
> > 
> >  	if (!size)
> >  		continue;
> > -- 
> > 2.41.0
> > 
> > 
> > > 
> > >  	if (!size)
> > >  		continue;
> > > @@ -1915,6 +1878,7 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> > >  		check_for_memory(pgdat);
> > >  	}
> > > 
> > > +	calc_nr_kernel_pages();
> > >  	memmap_init();
> > > 
> > >  	/* disable hash distribution for systems with a single node */
> > > -- 
> > > 2.41.0
> > 
> 
> -- 
> Sincerely yours,
> Mike.
> 
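P.S. Regarding "it will be reset when memblock allocator frees pages into
buddy system" in the code comment, if I read the code correctly the reset
happens when memblock_free_all() hands pages to the buddy allocator,
roughly (trimmed excerpt from mm/page_alloc.c, from memory):

void __free_pages_core(struct page *page, unsigned int order)
{
	unsigned int nr_pages = 1 << order;
	...
	/* pages freed from memblock are accumulated into managed_pages */
	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
	...
}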