On 17.02.25 10:29, Zhenhua Huang wrote:
> On the arm64 platform with the 4K base page config, SECTION_SIZE_BITS is
> set to 27, making one section 128M. The struct page region that the
> vmemmap maps for one section is then 2M.
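
For reference (not part of the quoted patch): the 2M figure is
PAGES_PER_SECTION * sizeof(struct page). A minimal stand-alone user-space
sketch of that arithmetic, assuming the typical 64-byte struct page:

#include <stdio.h>

int main(void)
{
        /* SECTION_SIZE_BITS == 27 on arm64 with 4K base pages */
        unsigned long section_size = 1UL << 27;        /* 128M */
        unsigned long page_size = 4096;
        unsigned long struct_page_size = 64;           /* assumed sizeof(struct page) */
        unsigned long pages_per_section = section_size / page_size;  /* 32768 */

        /* 32768 * 64 == 2M of vmemmap per section, i.e. exactly one PMD */
        printf("vmemmap per section: %luK\n",
               pages_per_section * struct_page_size / 1024);
        return 0;
}
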
> Commit c1cc1552616d ("arm64: MMU initialisation") optimized the vmemmap
> to populate at the PMD section level, which was suitable initially since
> the hotplug granule was always one section (128M). However, commit
> ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug") introduced a
> 2M (SUBSECTION_SIZE) hotplug granule, which broke the existing arm64
> assumption.
> The first problem is that if start or end is not aligned to a section
> boundary, such as when a subsection is hot added, populating the entire
> section is wasteful.
>
> The next problem is that if we hotplug something that spans part of a
> 128M section (subsections, let's call it memblock1), then hotplug
> something that spans another part of the same 128M section (subsections,
> let's call it memblock2), and subsequently unplug memblock1,
> vmemmap_free() will clear the entire PMD entry, which also backs
> memblock2, even though memblock2 is still active.
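
To make that concrete (again not from the patch; the offsets are
hypothetical and sizeof(struct page) is assumed to be 64 bytes): two
distinct 2M subsections of the same 128M section land in the same 2M
vmemmap PMD, so clearing that PMD for one memblock unmaps the other's
struct pages as well:

#include <stdio.h>

int main(void)
{
        unsigned long page_size = 4096;
        unsigned long struct_page_size = 64;        /* assumed */
        unsigned long subsection_size = 2UL << 20;  /* SUBSECTION_SIZE: 2M */
        unsigned long pmd_size = 2UL << 20;         /* one PMD maps 2M of vmemmap */

        /* hypothetical: two different subsections of the same 128M section,
         * given as offsets from the section start */
        unsigned long memblock1 = 0 * subsection_size;
        unsigned long memblock2 = 10 * subsection_size;

        /* vmemmap each needs: (offset / page size) * sizeof(struct page) */
        unsigned long vm1 = memblock1 / page_size * struct_page_size;  /* 0 */
        unsigned long vm2 = memblock2 / page_size * struct_page_size;  /* 320K */

        /* both land in vmemmap PMD 0: clearing it when unplugging memblock1
         * also unmaps the struct pages memblock2 still uses */
        printf("PMD index: %lu vs %lu\n", vm1 / pmd_size, vm2 / pmd_size);
        return 0;
}
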
> Assuming hotplug/unplug sizes are guaranteed to be symmetric, fix this
> the way x86-64 does: populate down to the page level if start/end is not
> aligned to a section boundary.
>
> Signed-off-by: Zhenhua Huang <quic_zhenhuah@xxxxxxxxxxx>
> ---
>  arch/arm64/mm/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index b4df5bc5b1b8..eec1666da368 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1178,7 +1178,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  {
>          WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
>
> -        if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
> +        if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
> +            (end - start < PAGES_PER_SECTION * sizeof(struct page)))
>                  return vmemmap_populate_basepages(start, end, node, altmap);
>          else
>                  return vmemmap_populate_hugepages(start, end, node, altmap);
Yes, this does mimic what x86 does. That handling looks weird, though: it
doesn't care about any address alignments, only about the size.
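
As a toy illustration of that point (hypothetical numbers, not from either
patch): a range sized exactly one section's worth of vmemmap takes the
hugepage path even when it is not PMD-aligned, because the check only looks
at the size:

#include <stdio.h>

/* assumptions: PAGES_PER_SECTION * sizeof(struct page) == 2M, PMD_SIZE == 2M */
#define SECTION_VMEMMAP (2UL << 20)
#define PMD_SIZE        (2UL << 20)

int main(void)
{
        unsigned long start = 0x30000;  /* hypothetical, not PMD-aligned */
        unsigned long end = start + SECTION_VMEMMAP;

        if (end - start < SECTION_VMEMMAP)
                printf("size check says: basepages\n");
        else
                printf("size check says: hugepages, although start %% PMD_SIZE = %#lx\n",
                       start % PMD_SIZE);
        return 0;
}
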
I wonder if we could do better and move this handling
into vmemmap_populate_hugepages(), where we already have a fallback
to vmemmap_populate_basepages().
Something like the following. One thing that confuses me is the "altmap"
handling in the x86-64 code, in particular why it is ignored in some cases,
so that part might need a bit of thought / double-checking:
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 01ea7c6df3036..57542313c0000 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1546,10 +1546,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
         VM_BUG_ON(!PAGE_ALIGNED(start));
         VM_BUG_ON(!PAGE_ALIGNED(end));

-        if (end - start < PAGES_PER_SECTION * sizeof(struct page))
-                err = vmemmap_populate_basepages(start, end, node, NULL);
-        else if (boot_cpu_has(X86_FEATURE_PSE))
+        if (boot_cpu_has(X86_FEATURE_PSE))
                 err = vmemmap_populate_hugepages(start, end, node, altmap);
         else if (altmap) {
                 pr_err_once("%s: no cpu support for altmap allocations\n",
                                 __func__);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 3287ebadd167d..8b217265b25b1 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -300,6 +300,10 @@ int __weak __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
         return 0;
 }

+/*
+ * Try to populate PMDs, but fallback to populating base pages when ranges
+ * would only partially cover a PMD.
+ */
 int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
                                          int node, struct vmem_altmap *altmap)
 {
@@ -313,6 +317,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
         for (addr = start; addr < end; addr = next) {
                 next = pmd_addr_end(addr, end);

+                if (!IS_ALIGNED(addr, PMD_SIZE) || !IS_ALIGNED(next, PMD_SIZE))
+                        goto fallback;
+
                 pgd = vmemmap_pgd_populate(addr, node);
                 if (!pgd)
                         return -ENOMEM;
@@ -346,6 +353,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
                         }
                 } else if (vmemmap_check_pmd(pmd, node, addr, next))
                         continue;
+fallback:
                 if (vmemmap_populate_basepages(addr, next, node, altmap))
                         return -ENOMEM;
         }
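
To see what this fallback decides in practice, here is a small user-space
sketch (hypothetical vmemmap addresses; pmd_addr_end() is re-implemented
locally; note a 2M subsection needs only 32K of vmemmap, so its range can
never be PMD-aligned):

#include <stdbool.h>
#include <stdio.h>

#define PMD_SIZE (2UL << 20)
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* local stand-in for the kernel's pmd_addr_end(): the end of the PMD
 * containing addr, clamped to end */
static unsigned long pmd_addr_end(unsigned long addr, unsigned long end)
{
        unsigned long boundary = (addr + PMD_SIZE) & ~(PMD_SIZE - 1);

        return boundary < end ? boundary : end;
}

int main(void)
{
        /* hypothetical vmemmap ranges: a full section (2M of vmemmap)
         * vs. a single 2M subsection (32K of vmemmap) */
        unsigned long ranges[][2] = {
                { 0x0,     0x200000 },
                { 0x28000, 0x30000 },
        };

        for (int i = 0; i < 2; i++) {
                unsigned long addr = ranges[i][0], end = ranges[i][1];
                unsigned long next = pmd_addr_end(addr, end);
                bool fallback = !IS_ALIGNED(addr, PMD_SIZE) ||
                                !IS_ALIGNED(next, PMD_SIZE);

                printf("[%#lx, %#lx): %s\n", addr, next,
                       fallback ? "basepages" : "try PMD");
        }
        return 0;
}
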
--
Cheers,
David / dhildenb