On Mon, Nov 29, 2021 at 12:31 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 11/28/21 03:18, Maxim Levitsky wrote:
> >
> > dmesg prints this:
> >
> > HugeTLB: allocating 64 of page size 1.00 GiB failed. Only allocated 0 hugepages
> >
> > Huge pages were allocated on kernel command line (1/2 of 128GB system):
> >
> > 'default_hugepagesz=1G hugepagesz=1G hugepages=64'
> >
> > This is a 3970X with no real support/need for NUMA, thus only fake NUMA node 0 is present.
> >
> > Reverting the commit helps.
> >
> > The new syntax also works ( hugepages=0:64 )
> >
> > I can test any patches for this bug.
>
> Argh! I think preallocation of gigantic pages on all systems with only
> a single node is broken. The issue is at the beginning of
> __alloc_bootmem_huge_page:
>
> int __alloc_bootmem_huge_page(struct hstate *h, int nid)
> {
>         struct huge_bootmem_page *m = NULL; /* initialize for clang */
>         int nr_nodes, node;
>
>         if (nid >= nr_online_nodes)
>                 return 0;
>
> Without using the node specific syntax, nid == NUMA_NO_NODE == -1. For the
> comparison, nid will be converted to an unsigned int to match nr_online_nodes,
> so we will immediately return 0 instead of doing the allocations.
>
> Zhenguo Yao,
> Can you verify and perhaps put together a patch?

Preallocation of gigantic pages can't work in any environment, not only on
single-node systems. I think the issue comes from replacing
nodes_weight(node_states[N_MEMORY]) with nr_online_nodes in the last version
of my patch. Sorry for my carelessness. I didn't notice that the parameter
nid is an int while nr_online_nodes is an unsigned int, so the check
if (nid >= nr_online_nodes) is always true when nid is NUMA_NO_NODE (-1).
I will send a fix as soon as possible. This is really a low-level mistake ^^

> > Also unrelated, is there any progress on allocating 1GB pages on demand so that I could
> > allocate them only when I run a VM?
>
> That should be possible. Such support was added back in 2014 with commit
> 944d9fec8d7a "hugetlb: add support for gigantic page allocation at runtime".
>
> > I don't mind having these pages marked as to be used for userspace only,
> > since as far as I remember it's the kernel usage that makes some pages unmovable.
>
> Of course, finding 1GB of contiguous space for a gigantic page is often
> difficult at runtime. So, allocations are likely to fail the longer the
> system is up and running and fragmentation increases.
>
> > Last time (many years ago) I tried to create a zone with only userspace pages
> > (I don't remember what options I used) but it didn't work.
>
> Not too long ago, support was added to use CMA for gigantic page allocation.
> See commit cf11e85fc08c "mm: hugetlb: optionally allocate gigantic hugepages
> using cma". This sounds like something you might want to try.
> --
> Mike Kravetz
>
> > Is there a way to debug what is causing unmovable pages and doesn't let
> > /proc/sys/vm/nr_hugepages work (I tried it today and as usual the number
> > it can allocate steadily decreases over time)?
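
[Editor's note] For readers following the signed/unsigned discussion above, here is
a minimal user-space sketch (not kernel code) of why the check misfires and one
possible shape of a fix. NUMA_NO_NODE and nr_online_nodes are stand-ins for the
kernel definitions, and the actual patch Zhenguo Yao sends may well look different:

	/*
	 * Sketch of the comparison bug in __alloc_bootmem_huge_page():
	 * an int nid of -1 (NUMA_NO_NODE) is promoted to unsigned int for
	 * the comparison, becomes UINT_MAX, and the early return fires
	 * on every system.
	 */
	#include <stdio.h>

	#define NUMA_NO_NODE (-1)

	static unsigned int nr_online_nodes = 1;	/* single-node system */

	/* Broken check: -1 promotes to UINT_MAX, so this is always true. */
	static int reject_nid_broken(int nid)
	{
		return nid >= nr_online_nodes;
	}

	/* One possible fix: handle NUMA_NO_NODE before the range check. */
	static int reject_nid_fixed(int nid)
	{
		return nid != NUMA_NO_NODE && nid >= (int)nr_online_nodes;
	}

	int main(void)
	{
		int nid = NUMA_NO_NODE;

		printf("broken check rejects nid: %d\n", reject_nid_broken(nid));
		printf("fixed check rejects nid:  %d\n", reject_nid_fixed(nid));
		return 0;
	}

Building this with -Wextra (which enables -Wsign-compare) makes the compiler flag
the broken comparison, which is one way to catch this class of mistake early.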