On 2022/4/19 22:07, Kefeng Wang wrote:
On 2022/4/19 12:03, Andrew Morton wrote:
On Sat, 16 Apr 2022 10:35:26 +0000 Peng Liu <liupeng256@xxxxxxxxxx> wrote:
Certain systems are designed to have sparse/discontiguous nodes. In
this case, nr_online_nodes cannot be used to walk through the NUMA
nodes. Also, a valid node ID may be greater than nr_online_nodes.
However, hugetlb assumes that nodes are contiguous. Recheck all the
places that use nr_online_nodes, and repair them one by one.
oops.
What are the user-visible runtime effects of this flaw?
For example, suppose there are four nodes (0, 1, 2, 3) and node 1 is
offline, so nr_online_nodes = 3:
1) per-node alloc (hugepages=1:2) fails,
2) per-node alloc (hugepages=3:2) fails, but it could succeed.
I assume that there are no user-visible runtime effects.
Thanks, you are right.
I have constructed a system with nodes 0, 1, 3, and 4, and requested huge pages as:
hugepagesz=1G hugepages=0:1,4:1 hugepagesz=2M hugepages=0:1024,4:1024
Without this patch:
HugeTLB: Invalid hugepages parameter 4:1
HugeTLB: Invalid hugepages parameter 4:1024
HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
HugeTLB registered 2.00 MiB page size, pre-allocated 1024 pages
With this patch:
HugeTLB registered 1.00 GiB page size, pre-allocated 2 pages
HugeTLB registered 2.00 MiB page size, pre-allocated 2048 pages
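
To illustrate why node 4 is rejected without the patch, here is a minimal
userspace sketch. It is not the hugetlb code and not the patch itself:
struct node_map, MAX_NODES and every function name below are hypothetical
stand-ins, and the bitmask only imitates the idea of an online-node mask.
It compares a bound check against the count of online nodes with a check
against the set of online nodes.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8 /* hypothetical upper bound on node IDs for this sketch */

/* Hypothetical stand-in for an online-node mask: bit n set means node n is online. */
struct node_map {
	unsigned int online_mask;
};

/* Number of online nodes; plays the role of nr_online_nodes in this sketch. */
int count_online(const struct node_map *m)
{
	int n = 0;

	for (int nid = 0; nid < MAX_NODES; nid++)
		if (m->online_mask & (1u << nid))
			n++;
	return n;
}

/* Check that assumes contiguous nodes: any nid below the online count passes. */
bool nid_valid_contiguous(const struct node_map *m, int nid)
{
	return nid >= 0 && nid < count_online(m);
}

/* Check that tolerates sparse nodes: the nid must itself be online. */
bool nid_valid_sparse(const struct node_map *m, int nid)
{
	return nid >= 0 && nid < MAX_NODES && (m->online_mask & (1u << nid));
}

/* Replays the two layouts discussed in this thread. */
void demo_check(void)
{
	/* Nodes 0, 1, 3, 4 online: four online nodes, highest ID is 4. */
	struct node_map sparse = { .online_mask = 0x1B };
	/* Nodes 0, 2, 3 online, node 1 offline: three online nodes. */
	struct node_map hole = { .online_mask = 0x0D };

	/* "4:1" is rejected by the count-based check although node 4 is online. */
	assert(!nid_valid_contiguous(&sparse, 4));
	assert(nid_valid_sparse(&sparse, 4));

	/* "1:2" passes the count-based check although node 1 is offline. */
	assert(nid_valid_contiguous(&hole, 1));
	assert(!nid_valid_sparse(&hole, 1));

	/* "3:2" is rejected by the count-based check although node 3 is online. */
	assert(!nid_valid_contiguous(&hole, 3));
	assert(nid_valid_sparse(&hole, 3));
}
```

Calling demo_check() from a small main() replays both cases: with nodes
0, 1, 3, 4 the count-based check refuses node 4, and with node 1 offline
it accepts node 1 while refusing node 3.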