On Wed, 2023-08-02 at 15:20 +0100, Jonathan Cameron wrote:
> On Tue, 01 Aug 2023 23:55:37 -0600
> Vishal Verma <vishal.l.verma@xxxxxxxxx> wrote:
>
> > The MHP_MEMMAP_ON_MEMORY flag for hotplugged memory is restricted to
> > 'memblock_size' chunks of memory being added. Adding a larger span of
> > memory precludes memmap_on_memory semantics.
> >
> > For users of hotplug such as kmem, large amounts of memory might get
> > added from the CXL subsystem. In some cases, this amount may exceed the
> > available 'main memory' to store the memmap for the memory being added.
> > In this case, it is useful to have a way to place the memmap on the
> > memory being added, even if it means splitting the addition into
> > memblock-sized chunks.
> >
> > Change add_memory_resource() to loop over memblock-sized chunks of
> > memory if caller requested memmap_on_memory, and if other conditions for
> > it are met. Teach try_remove_memory() to also expect that a memory
> > range being removed might have been split up into memblock sized chunks,
> > and to loop through those as needed.
> >
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: David Hildenbrand <david@xxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Oscar Salvador <osalvador@xxxxxxx>
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Dave Jiang <dave.jiang@xxxxxxxxx>
> > Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> > Cc: Huang Ying <ying.huang@xxxxxxxxx>
> > Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> > Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
>
> A couple of trivial comments inline.

Hi Jonathan,

Thanks for taking a look.
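
As an aside for anyone skimming the thread, the splitting described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the patch: for_each_memblock_chunk(), chunk_fn and count_chunk() are hypothetical names, the block size is passed in rather than taken from the kernel, the range is assumed to be a whole multiple of the block size, and unwinding of already-added chunks is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-chunk callback: 0 on success, negative errno on failure. */
typedef int (*chunk_fn)(uint64_t start, uint64_t size, void *ctx);

/*
 * Walk [start, start + size) in block_size steps and call fn on each chunk.
 * Stops at the first failure and returns that error. Undoing chunks that
 * were already handled is deliberately not part of this sketch.
 */
static int for_each_memblock_chunk(uint64_t start, uint64_t size,
				   uint64_t block_size, chunk_fn fn, void *ctx)
{
	uint64_t off;
	int ret;

	assert(block_size > 0);

	/* Assumption: only block-aligned ranges are split this way. */
	if (start % block_size || size % block_size)
		return -EINVAL;

	/* Iterate by offset within the range, one block at a time. */
	for (off = 0; off < size; off += block_size) {
		ret = fn(start + off, block_size, ctx);
		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Example callback: only counts how many chunks were visited. */
static int count_chunk(uint64_t start, uint64_t size, void *ctx)
{
	(void)start;
	(void)size;
	(*(int *)ctx)++;
	return 0;
}
```

Called with a block size of 128 and a range of 512 units, count_chunk() runs four times. A range that is not a multiple of the block size is rejected with -EINVAL before any chunk is touched.
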

>
> > ---
> >  mm/memory_hotplug.c | 150 ++++++++++++++++++++++++++++++++--------------------
> >  1 file changed, 93 insertions(+), 57 deletions(-)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index d282664f558e..cae03c8d4bbf 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1383,6 +1383,44 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
> >  	return arch_supports_memmap_on_memory(vmemmap_size);
> >  }
> >
> > +static int add_memory_create_devices(int nid, struct memory_group *group,
> > +				     u64 start, u64 size, mhp_t mhp_flags)
> > +{
> > +	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
> > +	struct vmem_altmap mhp_altmap = {
> > +		.base_pfn = PHYS_PFN(start),
> > +		.end_pfn = PHYS_PFN(start + size - 1),
> > +	};
> > +	int ret;
> > +
> > +	if ((mhp_flags & MHP_MEMMAP_ON_MEMORY)) {
> > +		mhp_altmap.free = memory_block_memmap_on_memory_pages();
> > +		params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL);
> > +		if (!params.altmap)
> > +			return -ENOMEM;
> > +
> > +		memcpy(params.altmap, &mhp_altmap, sizeof(mhp_altmap));
> > +	}
> > +
> > +	/* call arch's memory hotadd */
> > +	ret = arch_add_memory(nid, start, size, &params);
> > +	if (ret < 0)
> > +		goto error;
> > +
> > +	/* create memory block devices after memory was added */
> > +	ret = create_memory_block_devices(start, size, params.altmap, group);
> > +	if (ret) {
> > +		arch_remove_memory(start, size, NULL);
>
> Maybe push this down to a second label?

Yep will do.

> <snip>
> > +
> > +static int __ref try_remove_memory(u64 start, u64 size)
> > +{
> > +	int ret, nid = NUMA_NO_NODE;
>
> I'm not overly keen to see the trivial rename of rc -> ret in here.
> Just makes it ever so slightly harder to compare old code and new code.

Yep - this was to work around the patches I was based on, which added
both a ret and left the original rc [1]. Aneesh will stick to 'rc' so my
next revision should sort this out naturally.

[1]: https://lore.kernel.org/all/715042319ceb86016a4986862a82756e5629d725.camel@xxxxxxxxx/
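
For reference, the "second label" shape suggested above could look roughly like the userspace sketch below. It is not the next revision of the patch. Everything in it is a hypothetical stand-in: struct altmap, the fake_*() helpers, undo_calls and add_one_block() exist only to show the unwind order. It also assumes that the existing error label frees the altmap, which the quoted hunk does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the altmap allocated per memory block. */
struct altmap { unsigned long base_pfn, end_pfn; };

/* Counts how often the "remove" step ran, so callers can observe unwinding. */
static int undo_calls;

/* Hypothetical stand-ins for the arch add, device creation and arch remove steps. */
static int fake_arch_add(int fail) { return fail ? -EINVAL : 0; }
static int fake_create_devices(int fail) { return fail ? -ENOMEM : 0; }
static void fake_arch_remove(void) { undo_calls++; }

/*
 * Two-label unwind: a failure after the add step jumps to err_remove, which
 * falls through to err_free; a failure in the add step itself only needs
 * err_free. On success the altmap is handed to the caller through *out.
 */
static int add_one_block(int want_altmap, int fail_add, int fail_devices,
			 struct altmap **out)
{
	struct altmap *altmap = NULL;
	int ret;

	assert(out != NULL);

	if (want_altmap) {
		altmap = calloc(1, sizeof(*altmap));
		if (!altmap)
			return -ENOMEM;
	}

	ret = fake_arch_add(fail_add);
	if (ret < 0)
		goto err_free;

	ret = fake_create_devices(fail_devices);
	if (ret)
		goto err_remove;

	*out = altmap;	/* success: ownership passes to the caller */
	return 0;

err_remove:
	fake_arch_remove();
err_free:
	free(altmap);	/* free(NULL) is a no-op when no altmap was requested */
	return ret;
}
```

The point of the second label is that each failure site only names how far it got, and the cleanup steps live in one place in reverse order of setup.
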