On 02/17/20 at 10:34am, Oscar Salvador wrote:
> On Mon, Feb 17, 2020 at 02:46:27PM +0900, kkabe@xxxxxxxxxxx wrote:
> > ===========================================
> > struct page * __meminit populate_section_memmap(unsigned long pfn,
> > 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
> > {
> > 	struct page *page, *ret;
> > 	unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
> > 
> > 	page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
> > 	if (page) {
> > 		goto got_map_page;
> > 	}
> > 	pr_info("%s: alloc_pages() returned 0x%p (should be 0), reverting to vmalloc(memmap_size=%lu)\n",
> > 		__func__, page, memmap_size);
> > 	BUG_ON(page != 0);
> > 
> > 	ret = vmalloc(memmap_size);
> > 	pr_info("%s: vmalloc(%lu) returned 0x%p\n", __func__, memmap_size, ret);
> > 	if (ret) {
> > 		goto got_map_ptr;
> > 	}
> > 
> > 	return NULL;
> > got_map_page:
> > 	ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
> > 	pr_info("%s: allocated struct page *page=0x%p\n", __func__, page);
> > got_map_ptr:
> > 
> > 	pr_info("%s: returning struct page * =0x%p\n", __func__, ret);
> > 	return ret;
> > }
> 
> Could you please replace %p with %px? With the first, pointers are hashed,
> so it is trickier to get an overview of their meaning.
> 
> David could be right about ZONE_NORMAL vs ZONE_HIGHMEM.
> IIUC, default_kernel_zone_for_pfn and default_zone_for_pfn seem to only deal
> with (ZONE_DMA,ZONE_NORMAL] or ZONE_MOVABLE.

Ah, I think you both have spotted the problem. On i386, without memory hot add,
normal memory only includes pages below 896M, and those are put into the normal
zone. The rest are put into the highmem zone.

How does this influence the page allocation? Hugely. As we know, on i386 normal
memory can be accessed through the direct mapping, namely PAGE_OFFSET + phys,
while highmem has to be accessed with kmap. However, the later hot-added memory
is all put into normal memory, so accessing it will stumble into the vmalloc
area, I would say.
So, i386 doesn't support memory hot add well. Not sure if the below change can
make it work normally. We can just adjust the hot-adding code as we have done
for boot memory: iterate zones up to and including highmem, if allowed, when
hot adding memory.

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 475d0d68a32c..1380392d9ef5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -716,7 +716,10 @@ static struct zone *default_kernel_zone_for_pfn(int nid, unsigned long start_pfn
 	struct pglist_data *pgdat = NODE_DATA(nid);
 	int zid;
 
-	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
+	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+		if (zid == ZONE_MOVABLE)
+			continue;
+
 		struct zone *zone = &pgdat->node_zones[zid];
 
 		if (zone_intersects(zone, start_pfn, nr_pages))