On 8/5/19 3:57 AM, Vlastimil Babka wrote:
> On 8/5/19 10:42 AM, Vlastimil Babka wrote:
>> On 8/3/19 12:39 AM, Mike Kravetz wrote:
>>> From: Hillf Danton <hdanton@xxxxxxxx>
>>>
>>> Address the issue of should_continue_reclaim continuing true too often
>>> for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned.
>>> This could happen during hugetlb page allocation causing stalls for
>>> minutes or hours.
>>>
>>> We can stop reclaiming pages if compaction reports it can make a progress.
>>> A code reshuffle is needed to do that.
>>
>>> And it has side-effects, however,
>>> with allocation latencies in other cases but that would come at the cost
>>> of potential premature reclaim which has consequences of itself.
>>
>> Based on Mel's longer explanation, can we clarify the wording here? e.g.:
>>
>> There might be side-effect for other high-order allocations that would
>> potentially benefit from more reclaim before compaction for them to be
>> faster and less likely to stall, but the consequences of
>> premature/over-reclaim are considered worse.
>>
>>> We can also bail out of reclaiming pages if we know that there are not
>>> enough inactive lru pages left to satisfy the costly allocation.
>>>
>>> We can give up reclaiming pages too if we see dryrun occur, with the
>>> certainty of plenty of inactive pages. IOW with dryrun detected, we are
>>> sure we have reclaimed as many pages as we could.
>>>
>>> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> Cc: Mel Gorman <mgorman@xxxxxxx>
>>> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
>>> Cc: Vlastimil Babka <vbabka@xxxxxxx>
>>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
>>> Signed-off-by: Hillf Danton <hdanton@xxxxxxxx>
>>> Tested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>> Acked-by: Mel Gorman <mgorman@xxxxxxx>
>>
>> Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
>> I will send some followup cleanup.
>
> How about this?
> ----8<----
> From 0040b32462587171ad22395a56699cc036ad483f Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@xxxxxxx>
> Date: Mon, 5 Aug 2019 12:49:40 +0200
> Subject: [PATCH] mm, reclaim: cleanup should_continue_reclaim()
>
> After commit "mm, reclaim: make should_continue_reclaim perform dryrun
> detection", closer look at the function shows, that nr_reclaimed == 0 means
> the function will always return false. And since non-zero nr_reclaimed implies
> non_zero nr_scanned, testing nr_scanned serves no purpose, and so does the
> testing for __GFP_RETRY_MAYFAIL.
>
> This patch thus cleans up the function to test only !nr_reclaimed upfront, and
> remove the __GFP_RETRY_MAYFAIL test and nr_scanned parameter completely.
> Comment is also updated, explaining that approximating "full LRU list has been
> scanned" with nr_scanned == 0 didn't really work.
>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>

Acked-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

Would you like me to add this to the series, or do you want to send later?
--
Mike Kravetz
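
The equivalence argued in the cleanup's commit message can be checked with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: old_should_continue() is a hypothetical reconstruction of the pre-cleanup logic from the description above, and the booleans compaction_ready and enough_inactive are made-up stand-ins for the per-zone compaction check and the inactive LRU size check.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the decision BEFORE the cleanup, reconstructed
 * from the commit message only. compaction_ready and enough_inactive
 * are illustrative stand-ins, not real kernel parameters.
 */
static bool old_should_continue(unsigned long nr_reclaimed,
				unsigned long nr_scanned,
				bool retry_mayfail,
				bool compaction_ready,
				bool enough_inactive)
{
	if (retry_mayfail) {
		/* "full LRU list has been scanned" approximated by nr_scanned == 0 */
		if (!nr_reclaimed && !nr_scanned)
			return false;
	} else {
		if (!nr_reclaimed)
			return false;
	}

	/* Stop reclaiming once compaction reports it can make progress */
	if (compaction_ready)
		return false;

	/* Dryrun detection: continue only if something was scanned and reclaimed */
	return enough_inactive && nr_scanned && nr_reclaimed;
}

/* Hypothetical model AFTER the cleanup: only !nr_reclaimed is tested upfront */
static bool new_should_continue(unsigned long nr_reclaimed,
				bool compaction_ready,
				bool enough_inactive)
{
	if (!nr_reclaimed)
		return false;
	if (compaction_ready)
		return false;
	return enough_inactive;
}

/*
 * Compare both models over a small input grid. Inputs with non-zero
 * nr_reclaimed and zero nr_scanned are skipped, because the commit
 * message relies on reclaimed pages always having been scanned.
 */
int count_mismatches(void)
{
	static const unsigned long counts[] = { 0, 1, 32 };
	int mismatches = 0;

	for (int r = 0; r < 3; r++) {
		for (int s = 0; s < 3; s++) {
			if (counts[r] && !counts[s])
				continue;
			for (int flags = 0; flags < 8; flags++) {
				bool retry_mayfail = flags & 1;
				bool compaction_ready = flags & 2;
				bool enough_inactive = flags & 4;

				if (old_should_continue(counts[r], counts[s],
							retry_mayfail,
							compaction_ready,
							enough_inactive) !=
				    new_should_continue(counts[r],
							compaction_ready,
							enough_inactive))
					mismatches++;
			}
		}
	}
	return mismatches;
}

/* Convenience wrapper: aborts if the two models ever disagree */
void verify_cleanup_model(void)
{
	assert(count_mismatches() == 0);
}
```

Calling verify_cleanup_model() from a test main() exercises every combination in the grid. Within this model, dropping the skip for "reclaimed but not scanned" inputs makes the two versions diverge, which is why the cleanup depends on that invariant.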