On Thu, May 09, 2019 at 04:58:56PM +0200, David Hildenbrand wrote:
>On 09.05.19 16:31, Wei Yang wrote:
>> On Tue, May 07, 2019 at 08:38:00PM +0200, David Hildenbrand wrote:
>>> Only memory to be added to the buddy and to be onlined/offlined by
>>> user space using memory block devices needs (and should have!) memory
>>> block devices.
>>>
>>> Factor out creation of memory block devices Create all devices after
>>> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
>>> because it is now effectively stale.
>>>
>>> Only after memory block devices have been added, memory can be onlined
>>> by user space. This implies, that memory is not visible to user space at
>>> all before arch_add_memory() succeeded.
>>>
>>> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>>> Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
>>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>>> Cc: "mike.travis@xxxxxxx" <mike.travis@xxxxxxx>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>>> Cc: Andrew Banman <andrew.banman@xxxxxxx>
>>> Cc: Oscar Salvador <osalvador@xxxxxxx>
>>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>>> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
>>> Cc: Qian Cai <cai@xxxxxx>
>>> Cc: Wei Yang <richard.weiyang@xxxxxxxxx>
>>> Cc: Arun KS <arunks@xxxxxxxxxxxxxx>
>>> Cc: Mathieu Malaterre <malat@xxxxxxxxxx>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>> ---
>>>  drivers/base/memory.c  | 70 ++++++++++++++++++++++++++----------------
>>>  include/linux/memory.h |  2 +-
>>>  mm/memory_hotplug.c    | 15 ++++-----
>>>  3 files changed, 53 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>> index 6e0cb4fda179..862c202a18ca 100644
>>> --- a/drivers/base/memory.c
>>> +++ b/drivers/base/memory.c
>>> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
>>>  	return 0;
>>>  }
>>>
>>> +static void unregister_memory(struct memory_block *memory)
>>> +{
>>> +	BUG_ON(memory->dev.bus != &memory_subsys);
>>> +
>>> +	/* drop the ref. we got via find_memory_block() */
>>> +	put_device(&memory->dev);
>>> +	device_unregister(&memory->dev);
>>> +}
>>> +
>>>  /*
>>> - * need an interface for the VM to add new memory regions,
>>> - * but without onlining it.
>>> + * Create memory block devices for the given memory area. Start and size
>>> + * have to be aligned to memory block granularity. Memory block devices
>>> + * will be initialized as offline.
>>>   */
>>> -int hotplug_memory_register(int nid, struct mem_section *section)
>>> +int hotplug_memory_register(unsigned long start, unsigned long size)
>>
>> One trivial suggestion about the function name.
>>
>> For memory_block device, sometimes we use the full name
>>
>>     find_memory_block
>>     init_memory_block
>>     add_memory_block
>>
>> But sometimes we use *nick* name
>>
>>     hotplug_memory_register
>>     register_memory
>>     unregister_memory
>>
>> This is a little bit confusion.
>>
>> Can we use one name convention here?
>
>We can just go for
>
>crate_memory_blocks() and free_memory_blocks(). Or do
>you have better suggestions?

s/crate/create/

Looks good to me.

>
>(I would actually even prefer "memory_block_devices", because memory
>blocks have different meanins)
>

Agree with you, this comes to my mind sometime ago :-)

>>
>> [...]
>>
>>>  /*
>>> @@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct resource *res)
>>>  	if (ret < 0)
>>>  		goto error;
>>>
>>> +	/* create memory block devices after memory was added */
>>> +	ret = hotplug_memory_register(start, size);
>>> +	if (ret) {
>>> +		arch_remove_memory(nid, start, size, NULL);
>>
>> Functionally, it works I think.
>>
>> But arch_remove_memory() would remove pages from zone. At this point, we just
>> allocate section/mmap for pages, the zones are empty and pages are not
>> connected to zone.
>>
>> Function  zone = page_zone(page); always gets zone #0, since pages->flags is 0
>> at this point. This is not exact.
>>
>> Would we add some comment to mention this? Or we need to clean up
>> arch_remove_memory() to take out __remove_zone()?
>
>That is precisely what is on my list next (see cover letter). This is
>already broken when memory that was never onlined is removed again.
>So I am planning to fix that independently.
>

Sounds great :-)

Hope you would cc me in the following series.

>
>--
>
>Thanks,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me