On Thu, May 13, 2021 at 10:18:51PM +0200, Uladzislau Rezki wrote:
> <SNIP>
>
> From the trace i get:
>
> <snip>
> [    0.243916] RIP: 0010:__alloc_pages+0x11e/0x310
> [    0.243916] Code: 84 c0 0f 85 02 01 00 00 89 d8 48 8b 54 24 08 8b 74 24 1c c1 e8 0c 83 e0 01 88 44 24 20 48 8b 04 24 48 85 d2 0f 85 71 01 00 00 <3b> 70 08 0f 82 68 01 00 00 48 89 44 24 10 48 8b 00 89 da 81 e2 00
> [    0.243916] RSP: 0000:ffffffffae803c38 EFLAGS: 00010246
> [    0.243916] RAX: 0000000000001cc0 RBX: 0000000000002102 RCX: 0000000000000004
> [    0.243916] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000002102
> [    0.243916] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000ffffdfff
> [    0.243916] R10: 0000000000000001 R11: ffffffffae803ac0 R12: 0000000000000000
> [    0.243916] R13: 0000000000002102 R14: 0000000000000001 R15: ffffa0938000d000
> [    0.243916] FS:  0000000000000000(0000) GS:ffff893ab7c00000(0000) knlGS:0000000000000000
> [    0.243916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.243916] CR2: 0000000000001cc8 CR3: 0000000176e10000 CR4: 00000000000006b0
> [    0.243916] Call Trace:
> [    0.243916]  __alloc_pages_bulk+0xaa1/0xb50
> <snip>
>
> (gdb) l *__alloc_pages+0x11e
> 0xffffffff8129d87e is in __alloc_pages (./include/linux/mmzone.h:1095).
> 1090		return zoneref->zone;
> 1091	}
> 1092
> 1093	static inline int zonelist_zone_idx(struct zoneref *zoneref)
> 1094	{
> 1095		return zoneref->zone_idx;
> 1096	}
> 1097
> 1098	static inline int zonelist_node_idx(struct zoneref *zoneref)
> 1099	{
> (gdb)
>
> Seems like "zoneref" refers to invalid address.
>
> Thoughts?

I have not previously read the patch, but there are a few concerns and
it's probably just as well this blew up early.

The bulk allocator assumes a valid node, but the patch can send in
NUMA_NO_NODE (-1). The high-order path gets away with that because it
uses alloc_pages_node(), which checks for nid == NUMA_NO_NODE; the bulk
allocator does no such check, so -1 is used to look up the zonelist,
which would be consistent with the invalid zoneref in the trace.

Also, area->pages is not necessarily initialised, so the bulk allocator
could interpret it as a partially populated array. Minimally you need:

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f3c5dd4ccc5b9..b638ff31b07e1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2792,6 +2792,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 
 	page_order = vm_area_page_order(area);
 	if (!page_order) {
+		memset(area->pages, 0, array_size);
+		if (node == NUMA_NO_NODE)
+			node = numa_mem_id();
 		area->nr_pages = __alloc_pages_bulk(gfp_mask, node, NULL,
 			nr_small_pages, NULL, area->pages);
 	} else {

However, the high-order path also looks suspicious: area->nr_pages is
advanced before the allocation attempt, so in the event alloc_pages_node()
returns NULL prematurely, area->nr_pages does not reflect the number of
pages actually allocated. That needs examination; an untested sketch of
the kind of restructuring I mean is below my sign-off.

As an aside, where or what is test_vmalloc.sh? It appears to have been
used a few times, but it's not clear it's representative, so are you
aware of workloads that are vmalloc-intensive? It does not matter for
the patch as such, but it would be nice to know of examples because I
cannot recall a time during the last few years when I saw vmalloc.c
high in profiles.

-- 
Mel Gorman
SUSE Labs
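
P.S. The sketch for the high-order path. It is untested and the loop
shape is assumed for illustration rather than taken from the patch; the
point is only that area->nr_pages advances after alloc_pages_node()
succeeds, so a premature NULL leaves the counter matching the pages
actually allocated and the failure path frees exactly those:

<snip>
	/*
	 * Illustrative sketch only, untested. Assumes the usual
	 * __vmalloc_area_node() locals: area, node, gfp_mask,
	 * page_order and nr_small_pages.
	 */
	while (area->nr_pages < nr_small_pages) {
		struct page *page;
		int i;

		page = alloc_pages_node(node, gfp_mask, page_order);
		if (unlikely(!page))
			break;	/* nr_pages still counts only allocated pages */

		for (i = 0; i < (1U << page_order); i++)
			area->pages[area->nr_pages + i] = page + i;

		/* Advance the counter only after a successful allocation. */
		area->nr_pages += 1U << page_order;
	}
<snip>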