On 17.02.20 11:33, Baoquan He wrote:
> On 02/17/20 at 11:24am, David Hildenbrand wrote:
>> On 17.02.20 11:13, Baoquan He wrote:
>>> On 02/17/20 at 10:34am, Oscar Salvador wrote:
>>>> On Mon, Feb 17, 2020 at 02:46:27PM +0900, kkabe@xxxxxxxxxxx wrote:
>>>>> ===========================================
>>>>> struct page * __meminit populate_section_memmap(unsigned long pfn,
>>>>>                 unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
>>>>> {
>>>>>         struct page *page, *ret;
>>>>>         unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
>>>>>
>>>>>         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
>>>>>         if (page) {
>>>>>                 goto got_map_page;
>>>>>         }
>>>>>         pr_info("%s: alloc_pages() returned 0x%p (should be 0), reverting to vmalloc(memmap_size=%lu)\n", __func__, page, memmap_size);
>>>>>         BUG_ON(page != 0);
>>>>>
>>>>>         ret = vmalloc(memmap_size);
>>>>>         pr_info("%s: vmalloc(%lu) returned 0x%p\n", __func__, memmap_size, ret);
>>>>>         if (ret) {
>>>>>                 goto got_map_ptr;
>>>>>         }
>>>>>
>>>>>         return NULL;
>>>>> got_map_page:
>>>>>         ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
>>>>>         pr_info("%s: allocated struct page *page=0x%p\n", __func__, page);
>>>>> got_map_ptr:
>>>>>
>>>>>         pr_info("%s: returning struct page * =0x%p\n", __func__, ret);
>>>>>         return ret;
>>>>> }
>>>>
>>>> Could you please replace %p with %px. With the first, pointers are hashed so it is trickier
>>>> to get an overview of the meaning.
>>>>
>>>> David could be right about ZONE_NORMAL vs ZONE_HIGHMEM.
>>>> IIUC, default_kernel_zone_for_pfn and default_zone_for_pfn seem to only deal with
>>>> (ZONE_DMA,ZONE_NORMAL] or ZONE_MOVABLE.
>>>
>>> Ah, I think you both have spotted the problem.
>>>
>>> In i386, if w/o memory hot add, normal memory will only include those
>>> below 896M and they are added into normal zone. The left are added into
>>> highmem zone.
>>>
>>> How this influence the page allocation?
>>>
>>> Very huge. As we know, in i386, normal memory can be accessed with
>>> virt_to_phys, namely PAGE_OFFSET + phys. But highmem has to be accessed
>>> with kmap. However, the later hot added memory are all put into normal
>>> memory, accessing into them will stump into vmalloc area, I would say.
>>>
>>> So, i386 doesn't support memory hot add well. Not sure if below change
>>> can make it work normally.
>>>
>>> We can just adjust the hot adding code as we have done for boot memory.
>>> Iterate zone from highmem if allowed when hot add memory.
>>>
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 475d0d68a32c..1380392d9ef5 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -716,7 +716,10 @@ static struct zone *default_kernel_zone_for_pfn(int nid, unsigned long start_pfn
>>>          struct pglist_data *pgdat = NODE_DATA(nid);
>>>          int zid;
>>>
>>> -        for (zid = 0; zid <= ZONE_NORMAL; zid++) {
>>> +        for (zid = 0; zid < MAX_NR_ZONES; zid++) {
>>
>> ZONE_DEVICE? :/
>
> Not sure if ZONE_DEVICE will be supported on 32 bit system.
>
>
>>
>>> +                if (zid == ZONE_MOVABLE)
>>> +                        continue;
>>> +
>>>                  struct zone *zone = &pgdat->node_zones[zid];
>>>
>>>                  if (zone_intersects(zone, start_pfn, nr_pages))
>>>
>>>
>>
>> What if somebody onlines memory from user space explicitly to the normal
>> zone? We can trigger crashes?
>
> Seems the current i386 code doesn't support it. Unless we change that
> too. If not reserving virtual address space, later added any memory has
> to be highmem.
>
>>
>> This doesn't look like it ever worked reliably, can we just disable
>> memory hotplug in case we have PAE? (especially, as continued i386
>> support is questionable)
>
> This is not PAE, this is only HIGHMEM4G.
>

Ah, okay. Anyhow, highmem combined with hotplug seems to be in a
questionable state. I'd vote for disabling it if possible.

-- 
Thanks,

David / dhildenb
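
For reference, the lowmem limit discussed in the quoted text can be illustrated with a minimal userspace sketch of the direct-map arithmetic (PAGE_OFFSET + phys). It assumes the common i386 3G/1G split, i.e. PAGE_OFFSET = 0xC0000000 and an 896M lowmem limit; both depend on the kernel configuration. The helper names (phys_is_lowmem, direct_map_virt, range_fits_direct_map) are hypothetical and are not kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed i386 defaults (3G/1G split); real values depend on the kernel config. */
#define PAGE_OFFSET   0xC0000000u   /* assumed start of the kernel direct map */
#define LOWMEM_LIMIT  (896u << 20)  /* assumed 896 MiB of directly mapped RAM */

/* Hypothetical helper: true if a physical address lies below the lowmem limit. */
static bool phys_is_lowmem(uint64_t phys)
{
	return phys < LOWMEM_LIMIT;
}

/*
 * Hypothetical helper: the direct-map translation PAGE_OFFSET + phys,
 * truncated to 32 bits the way a 32-bit virtual address would be.
 */
static uint32_t direct_map_virt(uint64_t phys)
{
	return (uint32_t)(PAGE_OFFSET + phys);
}

/*
 * Hypothetical helper: true if the whole range [start, start + size)
 * is reachable through the direct map, i.e. ends at or below the limit.
 */
static bool range_fits_direct_map(uint64_t start, uint64_t size)
{
	return size != 0 && start + size <= LOWMEM_LIMIT;
}
```

Under these assumed values, physical address 896M would translate to 0xF8000000, which is already past the end of the direct map, and 1G wraps around to 0 in 32 bits. A range hot-added above 896M therefore fails range_fits_direct_map() and would have to be treated as highmem.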