+ mm-sparse-optimize-memmap-allocation-during-sparse_init.patch added to -mm tree

The patch titled
     Subject: mm/sparse: optimize memmap allocation during sparse_init()
has been added to the -mm tree.  Its filename is
     mm-sparse-optimize-memmap-allocation-during-sparse_init.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-sparse-optimize-memmap-allocation-during-sparse_init.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-sparse-optimize-memmap-allocation-during-sparse_init.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Baoquan He <bhe@xxxxxxxxxx>
Subject: mm/sparse: optimize memmap allocation during sparse_init()

In sparse_init(), two temporary pointer arrays, usemap_map and map_map,
are allocated with the size of NR_MEM_SECTIONS.  They are used to store
each memory section's usemap and memmap if the section is marked as
present.  With the help of these two arrays, a contiguous memory chunk is
allocated for the usemaps and memmaps of the memory sections on one node.
This avoids excessive memory fragmentation.  In the diagram below, '1'
indicates a present memory section and '0' an absent one.  The number 'n'
can be much smaller than NR_MEM_SECTIONS on most systems.

|1|1|1|1|0|0|0|0|1|1|0|0|...|1|0||1|0|...|1||0|1|...|0|
-------------------------------------------------------
 0 1 2 3         4 5         i   i+1     n-1   n
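
For reference, the pre-patch sizing and indexing look like this (a
condensed excerpt of the lines changed in the diff below, not a complete
function):

	/* old scheme: one pointer slot per possible section */
	size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
	usemap_map = memblock_virt_alloc(size, 0);
	...
	/* indexed by absolute section number, so absent sections waste slots */
	usemap_map[pnum] = usemap;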

If we fail to populate the page tables to map one section's memmap, its
->section_mem_map will finally be cleared to indicate that the section is
not present.  After use, these two arrays are released at the end of
sparse_init().

In 4-level paging mode, each array costs 4MB, which is negligible.  In
5-level paging mode, however, they cost 256MB each, 512MB altogether.  A
kdump kernel usually reserves very little memory, e.g. 256MB.  So even
though they are only temporarily allocated, the cost is still not
acceptable.
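
For concreteness, here is the arithmetic behind those numbers, assuming
the common x86_64 values (SECTION_SIZE_BITS = 27; MAX_PHYSMEM_BITS = 46
with 4-level paging, 52 with 5-level paging):

	NR_MEM_SECTIONS = 1 << (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

	4-level: 1 << (46 - 27) = 512K sections
	         512K slots * 8 bytes/pointer = 4MB per array
	5-level: 1 << (52 - 27) = 32M sections
	         32M slots * 8 bytes/pointer = 256MB per array, 512MB for the pair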

In fact, there's no need to allocate them with the size of
NR_MEM_SECTIONS.  Since the ->section_mem_map clearing has been deferred
to the last step, the number of present memory sections is kept the same
during sparse_init() until we finally clear out the ->section_mem_map of
any memory section whose usemap or memmap was not correctly handled.
Thus, whenever the for_each_present_section_nr() loop is taken in the
middle, the i-th present memory section is always the same one.

Here, allocate usemap_map and map_map with the size of
'nr_present_sections' only.  For the i-th present memory section, install
its usemap and memmap to usemap_map[i] and map_map[i] during allocation.
Then, in the last for_each_present_section_nr() loop, which clears the
->section_mem_map of any failed memory section, fetch the usemap and
memmap from the usemap_map[] and map_map[] arrays and set them into
mem_section[] accordingly.
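
As a rough standalone model of the new indexing scheme (plain userspace C
written for this note, not kernel code; the section layout and sizes are
invented for illustration):

	#include <stdio.h>
	#include <stdlib.h>

	#define NR_MEM_SECTIONS 16	/* tiny stand-in for the real constant */

	/* 1 = present, 0 = absent, mimicking the diagram above */
	static const int present[NR_MEM_SECTIONS] = {
		1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0
	};

	int main(void)
	{
		int pnum, nr_present_sections = 0, nr_consumed_maps = 0;
		long *map_map;

		for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++)
			nr_present_sections += present[pnum];

		/* new scheme: size the temporary array by present sections only */
		map_map = calloc(nr_present_sections, sizeof(*map_map));
		if (!map_map)
			return 1;

		/* install phase: the i-th present section fills slot i */
		for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
			if (!present[pnum])
				continue;
			map_map[nr_consumed_maps++] = pnum;	/* stand-in for a memmap pointer */
		}

		/*
		 * Fetch phase: walking the present sections again visits them
		 * in the same order, so slot i still belongs to the i-th
		 * present section -- the invariant the patch relies on.
		 */
		nr_consumed_maps = 0;
		for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
			if (!present[pnum])
				continue;
			printf("section %2d -> map_map[%d] = %ld\n",
			       pnum, nr_consumed_maps, map_map[nr_consumed_maps]);
			nr_consumed_maps++;
		}

		free(map_map);
		return 0;
	}

With the old scheme, the array would instead have NR_MEM_SECTIONS slots
indexed directly by pnum, most of them unused on a sparsely populated
machine.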

Link: http://lkml.kernel.org/r/20180628062857.29658-5-bhe@xxxxxxxxxx
Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
Reviewed-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Oscar Salvador <osalvador@xxxxxxx>
Cc: Oscar Salvador <osalvador@xxxxxxxxxxxxxxxxxx>
Cc: Pankaj Gupta <pagupta@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/sparse-vmemmap.c |    5 ++--
 mm/sparse.c         |   44 +++++++++++++++++++++++++++++++++---------
 2 files changed, 38 insertions(+), 11 deletions(-)

diff -puN mm/sparse.c~mm-sparse-optimize-memmap-allocation-during-sparse_init mm/sparse.c
--- a/mm/sparse.c~mm-sparse-optimize-memmap-allocation-during-sparse_init
+++ a/mm/sparse.c
@@ -381,6 +381,7 @@ static void __init sparse_early_usemaps_
 	unsigned long pnum;
 	unsigned long **usemap_map = (unsigned long **)data;
 	int size = usemap_size();
+	int nr_consumed_maps = 0;
 
 	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nodeid),
 							  size * usemap_count);
@@ -392,9 +393,10 @@ static void __init sparse_early_usemaps_
 	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 		if (!present_section_nr(pnum))
 			continue;
-		usemap_map[pnum] = usemap;
+		usemap_map[nr_consumed_maps] = usemap;
 		usemap += size;
-		check_usemap_section_nr(nodeid, usemap_map[pnum]);
+		check_usemap_section_nr(nodeid, usemap_map[nr_consumed_maps]);
+		nr_consumed_maps++;
 	}
 }
 
@@ -419,29 +421,34 @@ void __init sparse_mem_maps_populate_nod
 	void *map;
 	unsigned long pnum;
 	unsigned long size = sizeof(struct page) * PAGES_PER_SECTION;
+	int nr_consumed_maps;
 
 	size = PAGE_ALIGN(size);
 	map = memblock_virt_alloc_try_nid_raw(size * map_count,
 					      PAGE_SIZE, __pa(MAX_DMA_ADDRESS),
 					      BOOTMEM_ALLOC_ACCESSIBLE, nodeid);
 	if (map) {
+		nr_consumed_maps = 0;
 		for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 			if (!present_section_nr(pnum))
 				continue;
-			map_map[pnum] = map;
+			map_map[nr_consumed_maps] = map;
 			map += size;
+			nr_consumed_maps++;
 		}
 		return;
 	}
 
 	/* fallback */
+	nr_consumed_maps = 0;
 	for (pnum = pnum_begin; pnum < pnum_end; pnum++) {
 		struct mem_section *ms;
 
 		if (!present_section_nr(pnum))
 			continue;
-		map_map[pnum] = sparse_mem_map_populate(pnum, nodeid, NULL);
-		if (map_map[pnum])
+		map_map[nr_consumed_maps] =
+				sparse_mem_map_populate(pnum, nodeid, NULL);
+		if (map_map[nr_consumed_maps++])
 			continue;
 		ms = __nr_to_section(pnum);
 		pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
@@ -521,6 +528,7 @@ static void __init alloc_usemap_and_memm
 		/* new start, update count etc*/
 		nodeid_begin = nodeid;
 		pnum_begin = pnum;
+		data += map_count * data_unit_size;
 		map_count = 1;
 	}
 	/* ok, last chunk */
@@ -539,6 +547,7 @@ void __init sparse_init(void)
 	unsigned long *usemap;
 	unsigned long **usemap_map;
 	int size;
+	int nr_consumed_maps = 0;
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
 	int size2;
 	struct page **map_map;
@@ -561,7 +570,7 @@ void __init sparse_init(void)
 	 * powerpc need to call sparse_init_one_section right after each
 	 * sparse_early_mem_map_alloc, so allocate usemap_map at first.
 	 */
-	size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
+	size = sizeof(unsigned long *) * nr_present_sections;
 	usemap_map = memblock_virt_alloc(size, 0);
 	if (!usemap_map)
 		panic("can not allocate usemap_map\n");
@@ -570,7 +579,7 @@ void __init sparse_init(void)
 				sizeof(usemap_map[0]));
 
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
-	size2 = sizeof(struct page *) * NR_MEM_SECTIONS;
+	size2 = sizeof(struct page *) * nr_present_sections;
 	map_map = memblock_virt_alloc(size2, 0);
 	if (!map_map)
 		panic("can not allocate map_map\n");
@@ -579,27 +588,44 @@ void __init sparse_init(void)
 				sizeof(map_map[0]));
 #endif
 
+	/* The number of present sections stored in nr_present_sections
+	 * is kept the same since mem sections are marked as present in
+	 * memory_present(). In this for loop, we need to check which
+	 * sections failed to allocate memmap or usemap, then clear their
+	 * ->section_mem_map accordingly. During this process, we need to
+	 * increase 'nr_consumed_maps' whether the allocation of memmap
+	 * or usemap failed or not, so that after we handle the i-th
+	 * memory section, we can get the memmap and usemap of the
+	 * (i+1)-th section correctly. */
 	for_each_present_section_nr(0, pnum) {
 		struct mem_section *ms;
+
+		if (nr_consumed_maps >= nr_present_sections) {
+			pr_err("nr_consumed_maps goes beyond nr_present_sections\n");
+			break;
+		}
 		ms = __nr_to_section(pnum);
-		usemap = usemap_map[pnum];
+		usemap = usemap_map[nr_consumed_maps];
 		if (!usemap) {
 			ms->section_mem_map = 0;
+			nr_consumed_maps++;
 			continue;
 		}
 
 #ifdef CONFIG_SPARSEMEM_ALLOC_MEM_MAP_TOGETHER
-		map = map_map[pnum];
+		map = map_map[nr_consumed_maps];
 #else
 		map = sparse_early_mem_map_alloc(pnum);
 #endif
 		if (!map) {
 			ms->section_mem_map = 0;
+			nr_consumed_maps++;
 			continue;
 		}
 
 		sparse_init_one_section(__nr_to_section(pnum), pnum, map,
 								usemap);
+		nr_consumed_maps++;
 	}
 
 	vmemmap_populate_print_last();
diff -puN mm/sparse-vmemmap.c~mm-sparse-optimize-memmap-allocation-during-sparse_init mm/sparse-vmemmap.c
--- a/mm/sparse-vmemmap.c~mm-sparse-optimize-memmap-allocation-during-sparse_init
+++ a/mm/sparse-vmemmap.c
@@ -281,6 +281,7 @@ void __init sparse_mem_maps_populate_nod
 	unsigned long pnum;
 	unsigned long size = sizeof(struct page) * PAGES_PER_SECTION;
 	void *vmemmap_buf_start;
+	int nr_consumed_maps = 0;
 
 	size = ALIGN(size, PMD_SIZE);
 	vmemmap_buf_start = __earlyonly_bootmem_alloc(nodeid, size * map_count,
@@ -295,8 +296,8 @@ void __init sparse_mem_maps_populate_nod
 		if (!present_section_nr(pnum))
 			continue;
 
-		map_map[pnum] = sparse_mem_map_populate(pnum, nodeid, NULL);
-		if (map_map[pnum])
+		map_map[nr_consumed_maps] = sparse_mem_map_populate(pnum, nodeid, NULL);
+		if (map_map[nr_consumed_maps++])
 			continue;
 		pr_err("%s: sparsemem memory map backing failed some memory will not be available\n",
 		       __func__);
_

Patches currently in -mm which might be from bhe@xxxxxxxxxx are

mm-sparse-add-a-static-variable-nr_present_sections.patch
mm-sparsemem-defer-the-ms-section_mem_map-clearing.patch
mm-sparse-add-a-new-parameter-data_unit_size-for-alloc_usemap_and_memmap.patch
mm-sparse-optimize-memmap-allocation-during-sparse_init.patch
mm-sparse-remove-config_sparsemem_alloc_mem_map_together.patch



