On Fri, 2 Mar 2018 23:55:29 +0100 Daniel Vacek <neelx@xxxxxxxxxx> wrote:

> On Fri, Mar 2, 2018 at 9:59 PM,  <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > The patch titled
> >      Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
> > has been added to the -mm tree.  Its filename is
> >      mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
> >
> > This patch should soon appear at
> >     http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
> > and later at
> >     http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
> >
> > Before you just go and hit "reply", please:
> >    a) Consider who else should be cc'ed
> >    b) Prefer to cc a suitable mailing list as well
> >    c) Ideally: find the original patch on the mailing list and do a
> >       reply-to-all to that, adding suitable additional cc's
> >
> > *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> Documentation/process/submit-checklist.rst nowadays, btw.

thanks.

> > The -mm tree is included into linux-next and is updated
> > there every 3-4 working days
>
> Actually this should go directly to v4.16-rc4. Shall I cc Linus for v3
> I'm about to send?
> Or do you think it's fine for -next and -stable and we keep it like this?

I normally will sit on these things for a week or so before sending to
Linus, to make sure they've had a cycle in -next and that reviewers have
had time to consider the patch.

Take a look in http://ozlabs.org/~akpm/mmots/series, at the
"mainline-urgent" and "mainline-later" sections.  Those are the patches
which I'm planning on sending to Linus for this -rc cycle.

On this particular patch I'd like to see some reviews and acks - I'm not
terribly familiar with the zone initialization code.

Please be aware that this code later gets altered by
mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma.patch, which is
below.  I did some reject-fixing on this one.
From: Eugeniu Rosca <erosca@xxxxxxxxxxxxxx>
Subject: mm: page_alloc: skip over regions of invalid pfns on UMA

As a result of bisecting the v4.10..v4.11 commit range, it was determined
that commits [1] and [2] are both responsible for a ~140ms early startup
improvement on the Rcar-H3-ES20 arm64 platform.

Since the Rcar Gen3 family is not NUMA, we don't define CONFIG_NUMA in the
rcar3 defconfig (which also reduces the KNL binary image by ~64KB), but
this is how the boot time improvement is lost.

This patch makes optimization [2] available on UMA systems which provide
support for CONFIG_HAVE_MEMBLOCK.

Testing this change on Rcar H3-ES20-ULCB using a v4.15-rc9 KNL and the
vanilla arm64 defconfig + NUMA=n, a speed-up of ~139ms (from ~174ms [3] to
~35ms [4]) is observed in the execution of memmap_init_zone().

No boot time improvement is sensed on Apollo Lake SoC.

[1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING")
[2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
    where possible")

[3] 174ms spent in memmap_init_zone() on H3ULCB w/o this patch (NUMA=n)
[    2.643685] On node 0 totalpages: 1015808
[    2.643688]   DMA zone: 3584 pages used for memmap
[    2.643691]   DMA zone: 0 pages reserved
[    2.643693]   DMA zone: 229376 pages, LIFO batch:31
[    2.643696] > memmap_init_zone
[    2.663628] < memmap_init_zone (19.932 ms)
[    2.663632]   Normal zone: 12288 pages used for memmap
[    2.663635]   Normal zone: 786432 pages, LIFO batch:31
[    2.663637] > memmap_init_zone
[    2.818012] < memmap_init_zone (154.375 ms)
[    2.818041] psci: probing for conduit method from DT.
[4] 35ms spent in memmap_init_zone() on H3ULCB with this patch (NUMA=n)
[    2.677202] On node 0 totalpages: 1015808
[    2.677205]   DMA zone: 3584 pages used for memmap
[    2.677208]   DMA zone: 0 pages reserved
[    2.677211]   DMA zone: 229376 pages, LIFO batch:31
[    2.677213] > memmap_init_zone
[    2.684378] < memmap_init_zone (7.165 ms)
[    2.684382]   Normal zone: 12288 pages used for memmap
[    2.684385]   Normal zone: 786432 pages, LIFO batch:31
[    2.684387] > memmap_init_zone
[    2.712556] < memmap_init_zone (28.169 ms)
[    2.712584] psci: probing for conduit method from DT.

[mhocko@xxxxxxxxxx: fix build]
  Link: http://lkml.kernel.org/r/20180222072037.GC30681@xxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/20180217222846.29589-1-rosca.eugeniu@xxxxxxxxx
Signed-off-by: Eugeniu Rosca <erosca@xxxxxxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Reported-by: Eugeniu Rosca <erosca@xxxxxxxxxxxxxx>
Tested-by: Eugeniu Rosca <erosca@xxxxxxxxxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/memblock.h |    3 +-
 mm/memblock.c            |   54 ++++++++++++++++++-------------------
 mm/page_alloc.c          |    2 -
 3 files changed, 30 insertions(+), 29 deletions(-)

diff -puN include/linux/memblock.h~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma include/linux/memblock.h
--- a/include/linux/memblock.h~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma
+++ a/include/linux/memblock.h
@@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned lon
 			    unsigned long *end_pfn);
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
 			  unsigned long *out_end_pfn, int *out_nid);
-unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
 
 /**
  * for_each_mem_pfn_range - early memory pfn range iterator
@@ -204,6 +203,8 @@ unsigned long memblock_next_valid_pfn(un
 	     i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
+
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff -puN mm/memblock.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma mm/memblock.c
--- a/mm/memblock.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma
+++ a/mm/memblock.c
@@ -1101,33 +1101,6 @@ void __init_memblock __next_mem_pfn_rang
 	*out_nid = r->nid;
 }
 
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
-{
-	struct memblock_type *type = &memblock.memory;
-	unsigned int right = type->cnt;
-	unsigned int mid, left = 0;
-	phys_addr_t addr = PFN_PHYS(++pfn);
-
-	do {
-		mid = (right + left) / 2;
-
-		if (addr < type->regions[mid].base)
-			right = mid;
-		else if (addr >= (type->regions[mid].base +
-				  type->regions[mid].size))
-			left = mid + 1;
-		else {
-			/* addr is within the region, so pfn is valid */
-			return pfn;
-		}
-	} while (left < right);
-
-	if (right == type->cnt)
-		return -1UL;
-	else
-		return PHYS_PFN(type->regions[right].base);
-}
-
 /**
  * memblock_set_node - set node ID on memblock regions
  * @base: base of area to set node ID for
@@ -1159,6 +1132,33 @@ int __init_memblock memblock_set_node(ph
 }
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
+{
+	struct memblock_type *type = &memblock.memory;
+	unsigned int right = type->cnt;
+	unsigned int mid, left = 0;
+	phys_addr_t addr = PFN_PHYS(++pfn);
+
+	do {
+		mid = (right + left) / 2;
+
+		if (addr < type->regions[mid].base)
+			right = mid;
+		else if (addr >= (type->regions[mid].base +
+				  type->regions[mid].size))
+			left = mid + 1;
+		else {
+			/* addr is within the region, so pfn is valid */
+			return pfn;
+		}
+	} while (left < right);
+
+	if (right == type->cnt)
+		return -1UL;
+	else
+		return PHYS_PFN(type->regions[right].base);
+}
+
 static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 					phys_addr_t align, phys_addr_t start,
 					phys_addr_t end, int nid, ulong flags)
diff -puN mm/page_alloc.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma
+++ a/mm/page_alloc.c
@@ -5440,7 +5440,6 @@ void __meminit memmap_init_zone(unsigned
 			goto not_early;
 
 		if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 			/*
 			 * Skip to the pfn preceding the next valid one (or
 			 * end_pfn), such that we hit a valid pfn (or end_pfn)
@@ -5450,6 +5449,7 @@ void __meminit memmap_init_zone(unsigned
 			 * the valid region but still depends on correct page
 			 * metadata.
 			 */
+#ifdef CONFIG_HAVE_MEMBLOCK
 			pfn = (memblock_next_valid_pfn(pfn) &
 			       ~(pageblock_nr_pages-1)) - 1;
 #endif
_
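For anyone reviewing the logic rather than the plumbing, the two pieces of the patch can be modeled in userspace. The sketch below is an assumption-laden toy, not kernel code: it replaces the memblock API with a hand-built sorted region table, and hard-codes PAGE_SHIFT = 12 and pageblock_nr_pages = 512 (a 4K-page, 2MB-pageblock configuration) purely for illustration.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define pageblock_nr_pages 512UL	/* assumed 2MB pageblocks, 4K pages */

/* Toy stand-in for memblock.memory.regions: sorted, non-overlapping. */
struct region {
	unsigned long base;	/* physical address */
	unsigned long size;
};

/*
 * Mirrors memblock_next_valid_pfn(): binary-search the region table for
 * the address of pfn + 1.  If it lands inside a region, that pfn is
 * valid; otherwise return the first pfn of the next region, or -1UL if
 * we fell past the last region.
 */
static unsigned long next_valid_pfn(const struct region *regions,
				    unsigned int cnt, unsigned long pfn)
{
	unsigned int right = cnt;
	unsigned int mid, left = 0;
	unsigned long addr = ++pfn << PAGE_SHIFT;

	do {
		mid = (right + left) / 2;

		if (addr < regions[mid].base)
			right = mid;
		else if (addr >= regions[mid].base + regions[mid].size)
			left = mid + 1;
		else
			return pfn;	/* addr is within the region */
	} while (left < right);

	if (right == cnt)
		return -1UL;
	return regions[right].base >> PAGE_SHIFT;
}

/*
 * Mirrors the memmap_init_zone() hunk: round the next valid pfn down to
 * a pageblock boundary and subtract one, so the loop's pfn++ re-enters
 * the valid range at (or before) a pageblock-aligned pfn.
 */
static unsigned long skip_to(const struct region *regions,
			     unsigned int cnt, unsigned long pfn)
{
	return (next_valid_pfn(regions, cnt, pfn) &
		~(pageblock_nr_pages - 1)) - 1;
}
```

With two regions at 0x00000000 and 0x48000000 (256MB each), skipping from the last pfn of the first region yields pfn 0x47fff, so the caller's pfn++ lands exactly on the pageblock-aligned first pfn of the second region; this is the alignment behavior the companion -mm fix is about.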