On 8/3/19 12:39 AM, Mike Kravetz wrote:
> From: Hillf Danton <hdanton@xxxxxxxx>
> 
> Address the issue of should_continue_reclaim returning true too often
> for __GFP_RETRY_MAYFAIL attempts when !nr_reclaimed and nr_scanned.
> This could happen during hugetlb page allocation, causing stalls for
> minutes or hours.
> 
> We can stop reclaiming pages if compaction reports it can make progress.
> A code reshuffle is needed to do that. And it has side-effects, however,
> with allocation latencies in other cases but that would come at the cost
> of potential premature reclaim which has consequences of itself.

Based on Mel's longer explanation, can we clarify the wording here? E.g.:

There might be a side-effect for other high-order allocations that would
potentially benefit from more reclaim before compaction for them to be
faster and less likely to stall, but the consequences of
premature/over-reclaim are considered worse.

> We can also bail out of reclaiming pages if we know that there are not
> enough inactive lru pages left to satisfy the costly allocation.
> 
> We can give up reclaiming pages too if we see dryrun occur, with the
> certainty of plenty of inactive pages. IOW with dryrun detected, we are
> sure we have reclaimed as many pages as we could.
> 
> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: Hillf Danton <hdanton@xxxxxxxx>
> Tested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Acked-by: Mel Gorman <mgorman@xxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

I will send some followup cleanup.

Shouldn't there also be Mike's SOB?

> ---
>  mm/vmscan.c | 28 +++++++++++++++-------------
>  1 file changed, 15 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 47aa2158cfac..a386c5351592 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2738,18 +2738,6 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
>  		return false;
>  	}
>  
> -	/*
> -	 * If we have not reclaimed enough pages for compaction and the
> -	 * inactive lists are large enough, continue reclaiming
> -	 */
> -	pages_for_compaction = compact_gap(sc->order);
> -	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
> -	if (get_nr_swap_pages() > 0)
> -		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
> -	if (sc->nr_reclaimed < pages_for_compaction &&
> -			inactive_lru_pages > pages_for_compaction)
> -		return true;
> -
>  	/* If compaction would go ahead or the allocation would succeed, stop */
>  	for (z = 0; z <= sc->reclaim_idx; z++) {
>  		struct zone *zone = &pgdat->node_zones[z];
> @@ -2765,7 +2753,21 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
>  			;
>  		}
>  	}
> -	return true;
> +
> +	/*
> +	 * If we have not reclaimed enough pages for compaction and the
> +	 * inactive lists are large enough, continue reclaiming
> +	 */
> +	pages_for_compaction = compact_gap(sc->order);
> +	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
> +	if (get_nr_swap_pages() > 0)
> +		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
> +
> +	return inactive_lru_pages > pages_for_compaction &&
> +		/*
> +		 * avoid dryrun with plenty of inactive pages
> +		 */
> +		nr_scanned && nr_reclaimed;
>  }
>  
>  static bool pgdat_memcg_congested(pg_data_t *pgdat, struct mem_cgroup *memcg)
> 
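
For the record, the decision order after this patch can be restated as the
toy userspace model below. This is only an illustration of the logic in the
hunks above; the function name, stub parameters and sample values are made
up and it is not the kernel code itself.

/*
 * Toy model of the post-patch should_continue_reclaim() decision order
 * (hypothetical names/values, illustration only, not kernel code).
 */
#include <stdbool.h>
#include <stdio.h>

static bool should_continue_reclaim_model(bool compaction_ready,
                                          unsigned long inactive_lru_pages,
                                          unsigned long pages_for_compaction,
                                          unsigned long nr_scanned,
                                          unsigned long nr_reclaimed)
{
        /* 1. If compaction can already go ahead, stop reclaiming. */
        if (compaction_ready)
                return false;

        /*
         * 2. Otherwise continue only if the inactive lists could still feed
         *    compaction and the last pass was not a dryrun, i.e. something
         *    was scanned and something was reclaimed.
         */
        return inactive_lru_pages > pages_for_compaction &&
               nr_scanned && nr_reclaimed;
}

int main(void)
{
        /* Dryrun: pages scanned but nothing reclaimed -> give up (prints 0). */
        printf("%d\n", should_continue_reclaim_model(false, 1024, 512, 100, 0));
        /* Progress made and enough inactive pages -> keep going (prints 1). */
        printf("%d\n", should_continue_reclaim_model(false, 1024, 512, 100, 50));
        return 0;
}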