On 11.07.23 17:40, Aneesh Kumar K V wrote:
On 7/11/23 8:56 PM, David Hildenbrand wrote:
On 11.07.23 06:48, Aneesh Kumar K.V wrote:
Radix vmemmap mapping can map things correctly at the PMD level or PTE
level based on different device boundary checks. Hence we skip the
restriction that the vmemmap size be a multiple of PMD_SIZE. This also
makes the feature more widely useful, because a PMD_SIZE vmemmap area
requires a memory block size of 2GiB.
We can also use MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY so that the feature
can work with a memory block size of 256MiB, using the altmap.reserve
feature to align things correctly at pageblock granularity. We can end up
losing some pages in memory with this. For example, with a 256MiB memory
block size, we require 4 pages to map vmemmap pages. In order to align
things correctly, we end up adding a reserve of 28 pages, i.e., for every
4096 pages, 28 pages get reserved.
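The numbers in the commit message can be reproduced with a small userspace sketch. It is not kernel code. The 64KiB page size, 64-byte struct page, and 32-page pageblock are assumptions inferred from the figures quoted above (4096 pages per 256MiB, 4 vmemmap pages, 28 reserved pages), and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Assumed values, inferred from the figures in the commit message. */
#define DEMO_PAGE_SIZE        (64ULL * 1024)  /* 64KiB base page */
#define DEMO_STRUCT_PAGE_SIZE 64ULL           /* sizeof(struct page), assumed */
#define DEMO_PAGEBLOCK_PAGES  32ULL           /* pages per pageblock, assumed */

/* Pages needed to hold the vmemmap for one memory block of block_size bytes. */
static uint64_t demo_vmemmap_pages(uint64_t block_size)
{
	uint64_t nr_pages = block_size / DEMO_PAGE_SIZE;
	uint64_t vmemmap_size = nr_pages * DEMO_STRUCT_PAGE_SIZE;

	/* Round up to whole pages. */
	return (vmemmap_size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/* Extra pages reserved so that vmemmap + reserve ends on a pageblock boundary. */
static uint64_t demo_reserve_pages(uint64_t block_size)
{
	uint64_t used = demo_vmemmap_pages(block_size);
	uint64_t rounded = (used + DEMO_PAGEBLOCK_PAGES - 1) /
			   DEMO_PAGEBLOCK_PAGES * DEMO_PAGEBLOCK_PAGES;

	return rounded - used;
}
```

Under these assumptions, demo_vmemmap_pages(256ULL << 20) gives 4 and demo_reserve_pages(256ULL << 20) gives 28. A 2GiB block yields a 2MiB vmemmap with no reserve, which matches the 2GiB figure if PMD_SIZE is 2MiB (also an assumption here).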
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/pgtable.h | 28 +++++++++++++++++++
.../platforms/pseries/hotplug-memory.c | 3 +-
mm/memory_hotplug.c | 2 ++
4 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 116d6add0bb0..f890907e5bbf 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -157,6 +157,7 @@ config PPC
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_KEEP_MEMBLOCK
+ select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE if PPC_RADIX_MMU
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 68817ea7f994..8e6c92dde6ad 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -169,6 +169,34 @@ static inline bool is_ioremap_addr(const void *x)
int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size);
bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start,
unsigned long page_size);
+/*
+ * mm/memory_hotplug.c:mhp_supports_memmap_on_memory goes into details
+ * some of the restrictions. We don't check for PMD_SIZE because our
+ * vmemmap allocation code can fallback correctly. The pageblock
+ * alignment requirement is met using altmap->reserve blocks.
+ */
+#define arch_supports_memmap_on_memory arch_supports_memmap_on_memory
+static inline bool arch_supports_memmap_on_memory(unsigned long size)
+{
+ unsigned long nr_pages = size >> PAGE_SHIFT;
+ unsigned long vmemmap_size = nr_pages * sizeof(struct page);
+
+ if (!radix_enabled())
+ return false;
+
+#ifdef CONFIG_PPC_4K_PAGES
+ return IS_ALIGNED(vmemmap_size, PMD_SIZE);
+#else
+ /*
+ * Make sure the vmemmap allocation is fully contianed
+ * so that we always allocate vmemmap memory from altmap area.
+ * The pageblock alignment requirement is met by using
+ * reserve blocks in altmap.
+ */
+ return IS_ALIGNED(vmemmap_size, PAGE_SIZE);
Can we move that check into common code as well?
If our (original) vmemmap size would not fit into a single page, we would be in trouble on any architecture. I did not check whether it would be an issue for arm64 as well, in case we eventually allow wasting memory.
For x86 and arm we already do IS_ALIGNED(vmemmap_size, PMD_SIZE); in arch_supports_memmap_on_memory(). That should imply PAGE_SIZE alignment.
If arm64 allows the usage of altmap.reserve, I would expect arch_supports_memmap_on_memory to have the PAGE_SIZE check.
Adding the PAGE_SIZE check in mhp_supports_memmap_on_memory() makes it a redundant check for x86 and arm currently?
IMHO not an issue. The common code check is a bit weaker and the arch
check a bit stronger.
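The "weaker common check, stronger arch check" relationship can be sketched in userspace as follows. This is an illustration only: CHK_IS_ALIGNED is a stand-in for the kernel's IS_ALIGNED(), the page and PMD sizes are assumed values, and the chk_* helpers are hypothetical names rather than kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's IS_ALIGNED(); a must be a power of two. */
#define CHK_IS_ALIGNED(x, a) ((((uint64_t)(x)) & ((uint64_t)(a) - 1)) == 0)

#define CHK_PAGE_SIZE (64ULL * 1024)        /* assumed base page size */
#define CHK_PMD_SIZE  (2ULL * 1024 * 1024)  /* assumed PMD size */

/* Weaker check, as proposed for common code: whole pages only. */
static bool chk_common(uint64_t vmemmap_size)
{
	return CHK_IS_ALIGNED(vmemmap_size, CHK_PAGE_SIZE);
}

/* Stronger check, as described for x86 and arm: whole PMDs only. */
static bool chk_arch_pmd(uint64_t vmemmap_size)
{
	return CHK_IS_ALIGNED(vmemmap_size, CHK_PMD_SIZE);
}

/* Both checks applied in sequence, common code first. */
static bool chk_supported(uint64_t vmemmap_size)
{
	return chk_common(vmemmap_size) && chk_arch_pmd(vmemmap_size);
}
```

Because PMD_SIZE is a multiple of PAGE_SIZE, any size that passes chk_arch_pmd() also passes chk_common(), so the common check never rejects anything the stricter arch check would accept. It only matters for an architecture whose own check is looser.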
modified mm/memory_hotplug.c
@@ -1293,6 +1293,13 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
*/
if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
return false;
+
+ /*
+ * Make sure the vmemmap allocation is fully contianed
s/contianed/contained/
+ * so that we always allocate vmemmap memory from altmap area.
In theory, it's not only the vmemmap size, but also the vmemmap start
(that it doesn't start somewhere in between a page, crossing a page). I
suspect the start is always guaranteed to be aligned (if the vmemmap
size is aligned), correct?
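The arithmetic behind this question can be sketched as below. It is a userspace illustration, not a statement about what the kernel guarantees. It assumes the vmemmap base is page aligned, that memory blocks start at multiples of the block size, a 64KiB page, and a 64-byte struct page. The vm_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VM_PAGE_SIZE        (64ULL * 1024)  /* assumed base page size */
#define VM_STRUCT_PAGE_SIZE 64ULL           /* sizeof(struct page), assumed */

/*
 * Byte offset, from the (assumed page-aligned) vmemmap base, of the
 * struct page array for the block_index-th memory block.
 */
static uint64_t vm_start_offset(uint64_t block_index, uint64_t block_size)
{
	uint64_t start_pfn = block_index * (block_size / VM_PAGE_SIZE);

	return start_pfn * VM_STRUCT_PAGE_SIZE;
}

/* True if that offset falls on a page boundary. */
static bool vm_start_aligned(uint64_t block_index, uint64_t block_size)
{
	return vm_start_offset(block_index, block_size) % VM_PAGE_SIZE == 0;
}
```

With block-aligned starts, the offset is block_index times the per-block vmemmap size, so a page-aligned size gives a page-aligned start for every block. With a 16MiB block under the same assumptions, the per-block vmemmap is only 16KiB, and the second block's vmemmap would start in the middle of a page.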
+ */
+ if (!IS_ALIGNED(vmemmap_size, PAGE_SIZE))
+ return false;
/*
* Without page reservation remaining pages should be pageblock aligned.
*/
--
Cheers,
David / dhildenb