Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:

> On 7/10/2023 2:11 PM, Huang, Ying wrote:
>> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
>>
>>> On my machine with below memory layout, and I can see it will take more
>>> time to skip the larger memory hole (range: 0x100000000 - 0x1800000000)
>>> when isolating free pages. So adding a new helper to skip the memory
>>> hole rapidly, which can reduce the time consumed from about 70us to less
>>> than 1us.
>>>
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
>>> [    0.000000]   DMA32    empty
>>> [    0.000000]   Normal   [mem 0x0000000100000000-0x0000001fa7ffffff]
>> The memory hole is at the beginning of zone NORMAL?  If so, should
>> zone
>
> No, the memory hole range is 0x1000000000 - 0x1800000000, and the
> normal zone is start from 0x100000000.
>
> I'm sorry I made a typo in the commit message, which confuses you. The
> memory hole range should be: 0x1000000000 - 0x1800000000. I updated
> the commit message to the following and addressed David's comment:

Got it!  Thanks for explanation!

> "
> Just like commit 9721fd82351d ("mm: compaction: skip memory hole rapidly
> when isolating migratable pages"), I can see it will also take more
> time to skip the larger memory hole (range: 0x1000000000 - 0x1800000000)
> when isolating free pages on my machine with below memory layout. So
> like commit 9721fd82351d, adding a new helper to skip the memory hole
> rapidly, which can reduce the time consumed from about 70us to less
> than 1us.

LGTM.
Reviewed-by: "Huang, Ying" <ying.huang@xxxxxxxxx>

> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   [mem 0x0000000100000000-0x0000001fa7ffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000040000000-0x0000000fffffffff]
> [    0.000000]   node   0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
> [    0.000000]   node   0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
> [    0.000000]   node   0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
> [    0.000000]   node   0: [mem 0x0000001fa4030000-0x0000001fa40effff]
> [    0.000000]   node   0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
> [    0.000000]   node   0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
> [    0.000000]   node   0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
> [    0.000000]   node   0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
> [    0.000000]   node   0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]
> "
>
>> NORMAL start at 0x1800000000?  And, the free pages will not be scanned
>> there?  Or my understanding were wrong.
>
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000040000000-0x0000000fffffffff]
>>> [    0.000000]   node   0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa4030000-0x0000001fa40effff]
>>> [    0.000000]   node   0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
>>> [    0.000000]   node   0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>> ---
>>>  mm/compaction.c | 30 +++++++++++++++++++++++++++++-
>>>  1 file changed, 29 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index 43358efdbdc2..9641e2131901 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -249,11 +249,31 @@ static unsigned long skip_offline_sections(unsigned long start_pfn)
>>>  	return 0;
>>>  }
>>> +
>>> +static unsigned long skip_offline_sections_reverse(unsigned long start_pfn)
>>> +{
>>> +	unsigned long start_nr = pfn_to_section_nr(start_pfn);
>>> +
>>> +	if (!start_nr || online_section_nr(start_nr))
>>> +		return 0;
>>> +
>>> +	while (start_nr-- > 0) {
>>> +		if (online_section_nr(start_nr))
>>> +			return section_nr_to_pfn(start_nr) + PAGES_PER_SECTION - 1;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>>  #else
>>>  static unsigned long skip_offline_sections(unsigned long start_pfn)
>>>  {
>>>  	return 0;
>>>  }
>>> +
>>> +static unsigned long skip_offline_sections_reverse(unsigned long start_pfn)
>>> +{
>>> +	return 0;
>>> +}
>>>  #endif
>>>  /*
>>> @@ -1668,8 +1688,16 @@ static void isolate_freepages(struct compact_control *cc)
>>>  		page = pageblock_pfn_to_page(block_start_pfn,
>>>  					     block_end_pfn,
>>>  					     zone);
>>> -		if (!page)
>>> +		if (!page) {
>>> +			unsigned long next_pfn;
>>> +
>>> +			next_pfn = skip_offline_sections_reverse(block_start_pfn);
>>> +			if (next_pfn)
>>> +				block_start_pfn = max(pageblock_start_pfn(next_pfn),
>>> +						      low_pfn);
>>> +
>>>  			continue;
>>> +		}
>>>  		/* Check the block is suitable for migration */
>>>  		if (!suitable_migration_target(cc, page))
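
For readers following the arithmetic, here is a rough userspace sketch of the
backward section walk in the quoted helper, applied to the node ranges from the
commit message. It is not the kernel code: the section_online[] array,
NR_SECTIONS, mark_online() and setup_layout_from_thread() are hypothetical
stand-ins, and the 4 KiB page size and 128 MiB section size are assumptions
that the thread does not state.

```c
#include <stdbool.h>

/* Assumed geometry (not stated in the thread): 4 KiB pages, 128 MiB sections. */
#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION	(1UL << PFN_SECTION_SHIFT)

/* Hypothetical stand-in for the kernel's section map: one flag per section. */
#define NR_SECTIONS		1024UL
static bool section_online[NR_SECTIONS];

static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

static unsigned long section_nr_to_pfn(unsigned long nr)
{
	return nr << PFN_SECTION_SHIFT;
}

static bool online_section_nr(unsigned long nr)
{
	return nr < NR_SECTIONS && section_online[nr];
}

/*
 * Same control flow as the helper in the quoted patch: if start_pfn sits in
 * an offline section, walk backwards and return the last PFN of the nearest
 * online section below it. Return 0 if start_pfn is already in an online
 * section or if no online section exists below it.
 */
static unsigned long skip_offline_sections_reverse(unsigned long start_pfn)
{
	unsigned long start_nr = pfn_to_section_nr(start_pfn);

	if (!start_nr || online_section_nr(start_nr))
		return 0;

	while (start_nr-- > 0) {
		if (online_section_nr(start_nr))
			return section_nr_to_pfn(start_nr) + PAGES_PER_SECTION - 1;
	}

	return 0;
}

/* Mark every section in the physical range [start, end) as online. */
static void mark_online(unsigned long long start, unsigned long long end)
{
	unsigned long nr;

	for (nr = start >> SECTION_SIZE_BITS; nr < (end >> SECTION_SIZE_BITS); nr++)
		section_online[nr] = true;
}

/* Coarse model of the node ranges above, with the hole left offline. */
static void setup_layout_from_thread(void)
{
	mark_online(0x40000000ULL, 0x1000000000ULL);
	mark_online(0x1800000000ULL, 0x1fa8000000ULL);
}
```

Under those assumptions the hole 0x1000000000 - 0x1800000000 covers sections
512 to 767. After setup_layout_from_thread(), calling
skip_offline_sections_reverse(0x17f8000), a PFN in the last section of the
hole, returns 0xffffff, the last PFN of section 511. The free scanner can then
jump straight below the hole instead of visiting each pageblock inside it.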