On 04.06.19 19:36, Robin Murphy wrote:
> On 04/06/2019 07:56, David Hildenbrand wrote:
>> On 03.06.19 23:41, Wei Yang wrote:
>>> On Mon, May 27, 2019 at 01:11:45PM +0200, David Hildenbrand wrote:
>>>> A proper arch_remove_memory() implementation is on its way, which also
>>>> cleanly removes page tables in arch_add_memory() in case something goes
>>>> wrong.
>>>
>>> Would this be better to understand?
>>>
>>> removes page tables created in arch_add_memory
>>
>> That's not what this sentence expresses. Have a look at
>> arch_add_memory(), in case __add_pages() fails, the page tables are not
>> removed. This will also be fixed by Anshuman in the same shot.
>>
>>>
>>>>
>>>> As we want to use arch_remove_memory() in case something goes wrong
>>>> during memory hotplug after arch_add_memory() finished, let's add
>>>> a temporary hack that is sufficient enough until we get a proper
>>>> implementation that cleans up page table entries.
>>>>
>>>> We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow up
>>>> patches.
>>>>
>>>> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
>>>> Cc: Will Deacon <will.deacon@xxxxxxx>
>>>> Cc: Mark Rutland <mark.rutland@xxxxxxx>
>>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>>> Cc: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
>>>> Cc: Chintan Pandya <cpandya@xxxxxxxxxxxxxx>
>>>> Cc: Mike Rapoport <rppt@xxxxxxxxxxxxx>
>>>> Cc: Jun Yao <yaojun8558363@xxxxxxxxx>
>>>> Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
>>>> Cc: Robin Murphy <robin.murphy@xxxxxxx>
>>>> Cc: Anshuman Khandual <anshuman.khandual@xxxxxxx>
>>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>>> ---
>>>>  arch/arm64/mm/mmu.c | 19 +++++++++++++++++++
>>>>  1 file changed, 19 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index a1bfc4413982..e569a543c384 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -1084,4 +1084,23 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>>  	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>>>>  			   restrictions);
>>>>  }
>>>> +#ifdef CONFIG_MEMORY_HOTREMOVE
>>>> +void arch_remove_memory(int nid, u64 start, u64 size,
>>>> +			struct vmem_altmap *altmap)
>>>> +{
>>>> +	unsigned long start_pfn = start >> PAGE_SHIFT;
>>>> +	unsigned long nr_pages = size >> PAGE_SHIFT;
>>>> +	struct zone *zone;
>>>> +
>>>> +	/*
>>>> +	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
>>>> +	 * adding fails). Until then, this function should only be used
>>>> +	 * during memory hotplug (adding memory), not for memory
>>>> +	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
>>>> +	 * unlocked yet.
>>>> +	 */
>>>> +	zone = page_zone(pfn_to_page(start_pfn));
>>>
>>> Compared with arch_remove_memory in x86. If altmap is not NULL, zone will be
>>> retrieved from page related to altmap. Not sure why this is not the same?
>>
>> This is a minimal implementation, sufficient for this use case here. A
>> full implementation is in the works.
>> For now, this function will not be
>> used with an altmap (ZONE_DEVICE is not supported for arm64 yet).
>
> FWIW the other pieces of ZONE_DEVICE are now due to land in parallel,
> but as long as we don't throw the ARCH_ENABLE_MEMORY_HOTREMOVE switch
> then there should still be no issue. Besides, given that we should
> consistently ignore the altmap everywhere at the moment, it may even
> work out regardless.

Thanks for the info.

>
> One thing stands out about the failure path thing, though - if
> __add_pages() did fail, can it still be guaranteed to have initialised
> the memmap such that page_zone() won't return nonsense? Last time I

If __add_pages() fails, then arch_add_memory() fails, and
arch_remove_memory() will not be called in the context of this series.
It is only called if arch_add_memory() succeeded.

> looked that was still a problem when removing memory which had been
> successfully added, but never onlined (although I do know that
> particular case was already being discussed at the time, and I've not
> been paying the greatest attention since).

Yes, that part is next on my list. It works but is ugly. The memory
removal process should not care about zones at all.

Slowly moving in the right direction :)

>
> Robin.
>

--
Thanks,

David / dhildenb
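
The failure-path reasoning in this reply (arch_remove_memory() is only
reached after arch_add_memory() has succeeded) can be illustrated with a
small standalone C model. This is only a sketch under stated assumptions:
every sim_* name and the struct are hypothetical stand-ins invented for
illustration, not kernel functions, and the "later step" stands for
whatever hotplug work follows arch_add_memory() in the series.

```c
#include <stdbool.h>

/* Hypothetical model state; nothing here is a real kernel structure. */
struct hotplug_sim {
	bool add_pages_fails;	/* make the modelled __add_pages() fail */
	bool later_step_fails;	/* make a step after arch_add_memory() fail */
	int remove_calls;	/* times the modelled arch_remove_memory() ran */
};

/* Stand-in for arch_add_memory(): fails exactly when __add_pages() fails. */
static int sim_arch_add_memory(const struct hotplug_sim *sim)
{
	return sim->add_pages_fails ? -1 : 0;
}

/* Stand-in for arch_remove_memory(): only counts how often it is invoked. */
static void sim_arch_remove_memory(struct hotplug_sim *sim)
{
	sim->remove_calls++;
}

/* Hypothetical caller modelling the hotplug (adding memory) path. */
static int sim_add_memory(struct hotplug_sim *sim)
{
	int ret = sim_arch_add_memory(sim);

	if (ret)
		return ret;	/* add failed: the remove stand-in is never called */

	if (sim->later_step_fails) {
		/* Roll back only after a successful arch-level add. */
		sim_arch_remove_memory(sim);
		return -1;
	}

	return 0;
}
```

In this model, a failing add leaves remove_calls at 0, while a failure
after a successful add bumps it to 1, which matches the ordering
described above.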