On 08/29/23 11:33, Xueshi Hu wrote:
> In set_nr_huge_pages(), local variable "count" is used to record
> persistent_huge_pages(), but when it comes to nodes huge page allocation,
> the semantics changes to nr_huge_pages. When there exists surplus huge
> pages and using the interface under
> /sys/devices/system/node/node*/hugepages to change huge page pool size,
> this difference can result in the allocation of an unexpected number of
> huge pages.
>
> Steps to reproduce the bug:
>
> Starting with:
>
>                   Node 0    Node 1     Total
> HugePages_Total     0.00      0.00      0.00
> HugePages_Free      0.00      0.00      0.00
> HugePages_Surp      0.00      0.00      0.00
>
> create 100 huge pages in Node 0 and consume it, then set Node 0's
> nr_hugepages to 0.
>
> yields:
>
>                   Node 0    Node 1     Total
> HugePages_Total   200.00      0.00    200.00
> HugePages_Free      0.00      0.00      0.00
> HugePages_Surp    200.00      0.00    200.00
>
> write 100 to Node 1's nr_hugepages
>
>	echo 100 > /sys/devices/system/node/node1/\
>	hugepages/hugepages-2048kB/nr_hugepages
>
> gets:
>
>                   Node 0    Node 1     Total
> HugePages_Total   200.00    400.00    600.00
> HugePages_Free      0.00    400.00    400.00
> HugePages_Surp    200.00      0.00    200.00
>
> Kernel is expected to create only 100 huge pages and it gives 200.
>
> Fixes: 9a30523066cd ("hugetlb: add per node hstate attributes")
> Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Signed-off-by: Xueshi Hu <xueshi.hu@xxxxxxxxxx>
> ---
> Change in v2:
> - Correct the fix tag

Thank you!
-- 
Mike Kravetz

> - v1: https://lore.kernel.org/linux-mm/20230828233448.GF3290@monkey/T/#t
>
>  mm/hugetlb.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6da626bfb52e..54e2e2e12aa9 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3494,7 +3494,9 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	if (nid != NUMA_NO_NODE) {
>  		unsigned long old_count = count;
>  
> -		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
> +		count += persistent_huge_pages(h) -
> +			 (h->nr_huge_pages_node[nid] -
> +			  h->surplus_huge_pages_node[nid]);
>  		/*
>  		 * User may have specified a large count value which caused the
>  		 * above calculation to overflow. In this case, they wanted
> -- 
> 2.40.1