On Tue 11-08-20 09:51:48, Baoquan He wrote: > On 08/10/20 at 05:19pm, Mike Kravetz wrote: > > On 8/9/20 7:17 PM, Baoquan He wrote: > > > On 08/07/20 at 05:12pm, Wei Yang wrote: > > >> Let's always increase surplus_huge_pages and so that free_huge_page > > >> could decrease it at free time. > > >> > > >> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxxxxxxxxxx> > > >> --- > > >> mm/hugetlb.c | 14 ++++++-------- > > >> 1 file changed, 6 insertions(+), 8 deletions(-) > > >> > > >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > >> index 1f2010c9dd8d..a0eb81e0e4c5 100644 > > >> --- a/mm/hugetlb.c > > >> +++ b/mm/hugetlb.c > > >> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask, > > >> return NULL; > > >> > > >> spin_lock(&hugetlb_lock); > > >> + > > >> + h->surplus_huge_pages++; > > >> + h->surplus_huge_pages_node[page_to_nid(page)]++; > > >> + > > >> /* > > >> * We could have raced with the pool size change. > > >> * Double check that and simply deallocate the new page > > >> - * if we would end up overcommiting the surpluses. Abuse > > >> - * temporary page to workaround the nasty free_huge_page > > >> - * codeflow > > >> + * if we would end up overcommiting the surpluses. > > >> */ > > >> - if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) { > > >> - SetPageHugeTemporary(page); > > > > > > Hmm, the temporary page way is taken intentionally in > > > commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks"). > > > From code, this is done inside hugetlb_lock holding, and the code flow > > > is straightforward, should be safe. Adding Michal to CC. But the lock is not held during the migration, right? > > I remember when the temporary page code was added for page migration. > > The use of temporary page here was added at about the same time. Temporary > > page does have one advantage in that it will not CAUSE surplus count to > > exceed overcommit. This patch could cause surplus to exceed overcommit > > for a very short period of time. However, do note that for this to happen > > the code needs to race with a pool resize which itself could cause surplus > > to exceed overcommit. Correct. > > IMO both approaches are valid. > > - Advantage of temporary page is that it can not cause surplus to exceed > > overcommit. Disadvantage is as mentioned in the comment 'abuse of temporary > > page'. > > - Advantage of this patch is that it uses existing counters. Disadvantage > > is that it can momentarily cause surplus to exceed overcommit. Do I remember correctly that this can cause an allocation failure due to overcommit check? In other words it would be user space visible thing? > Yeah, since it's all done inside hugetlb_lock, should be OK even > though it may cause surplus to exceed overcommit. > > > > Unless someone has a strong opinion, I prefer the changes in this patch. > > Agree, I also prefer the code change in this patch, to remove the > unnecessary confusion about the temporary page. I have managed to forgot all the juicy details since I have made that change. All that remains is that the surplus pages accounting was quite tricky and back then I didn't figure out a simpler method that would achieve the consistent look at those counters. As mentioned above I suspect this could lead to pre-mature allocation failures while the migration is ongoing. Sure quite unlikely to happen and the race window is likely very small. Maybe this is even acceptable but I would strongly recommend to have all this thinking documented in the changelog. -- Michal Hocko SUSE Labs