On Fri, Jul 26, 2019 at 3:09 PM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote: > > On Thu, Jul 25, 2019 at 09:50:21AM -0400, Yafang Shao wrote: > > By now there're three compaction paths, > > - direct compaction > > - kcompactd compcation > > - proc triggered compaction > > When we do compaction in all these paths, we will use compaction_suitable() > > to check whether a zone is suitable to do compaction. > > > > There're some issues around the usage of compaction_suitable(). > > We don't use the proper zoneid in kcompactd_node_suitable() when try to > > wakeup kcompactd. In the kcompactd compaction paths, we call > > compaction_suitable() twice and the zoneid isn't proper in the second call. > > For proc triggered compaction, the classzone_idx is always zero. > > > > In order to fix these issues, I change the type of classzone_idx in the > > struct compact_control from const int to int and assign the proper zoneid > > before calling compact_zone(). > > > > What is actually fixed by this? > Recently there's a page alloc failure on our server because the compaction can't satisfy it. This issue is unproducible, so I have to view the compaction code and find out the possible solutions. When I'm reading these compaction code, I find some misuse of compaction_suitable(). But after you point out, I find that I missed something. The classzone_idx should represent the alloc request, otherwise we may do unnecessary compaction on a zone. Thanks a lot for your explaination. Hi Andrew, Pls. help drop this patch. Sorry about that. I will think about it more. > > This patch also fixes some comments in struct compact_control, as these > > fields are not only for direct compactor but also for all other compactors. > > > > Fixes: ebff398017c6("mm, compaction: pass classzone_idx and alloc_flags to watermark checking") > > Fixes: 698b1b30642f("mm, compaction: introduce kcompactd") > > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > > Cc: Vlastimil Babka <vbabka@xxxxxxx> > > Cc: Arnd Bergmann <arnd@xxxxxxxx> > > Cc: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx> > > Cc: Rik van Riel <riel@xxxxxxxxxx> > > Cc: Yafang Shao <shaoyafang@xxxxxxxxxxxxxx> > > --- > > mm/compaction.c | 12 +++++------- > > mm/internal.h | 10 +++++----- > > 2 files changed, 10 insertions(+), 12 deletions(-) > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index ac4ead0..984dea7 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -2425,6 +2425,7 @@ static void compact_node(int nid) > > continue; > > > > cc.zone = zone; > > + cc.classzone_idx = zoneid; > > > > compact_zone(&cc, NULL); > > > > @@ -2508,7 +2509,7 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat) > > continue; > > > > if (compaction_suitable(zone, pgdat->kcompactd_max_order, 0, > > - classzone_idx) == COMPACT_CONTINUE) > > + zoneid) == COMPACT_CONTINUE) > > return true; > > } > > > > This is a semantic change. The use of the classzone_idx here and not > classzone_idx is so that the watermark check takes the lowmem reserves > into account in the __zone_watermark_ok check. This means that > compaction is more likely to proceed but not necessarily correct. > > > @@ -2526,7 +2527,6 @@ static void kcompactd_do_work(pg_data_t *pgdat) > > struct compact_control cc = { > > .order = pgdat->kcompactd_max_order, > > .search_order = pgdat->kcompactd_max_order, > > - .classzone_idx = pgdat->kcompactd_classzone_idx, > > .mode = MIGRATE_SYNC_LIGHT, > > .ignore_skip_hint = false, > > .gfp_mask = GFP_KERNEL, > > @@ -2535,7 +2535,7 @@ static void kcompactd_do_work(pg_data_t *pgdat) > > cc.classzone_idx); > > count_compact_event(KCOMPACTD_WAKE); > > > > - for (zoneid = 0; zoneid <= cc.classzone_idx; zoneid++) { > > + for (zoneid = 0; zoneid <= pgdat->kcompactd_classzone_idx; zoneid++) { > > int status; > > > > zone = &pgdat->node_zones[zoneid]; > > This variable can be updated by a wakeup while the loop is executing > making the loop more difficult to reason about given the exit conditions > can change. > Thanks for your point out. But seems there're still issues event without my change ? For example, If we call wakeup_kcompactd() while the kcompactd is running, we just modify the kcompactd_max_order and kcompactd_classzone_idx and then return. Then in another path, the wakeup_kcompactd() is called again, so kcompactd_classzone_idx and kcompactd_max_order will be override, that means the previous wakeup is missed. Right ? > Please explain what exactly this patch is fixing and why it should be > done because it currently appears to be making a number of subtle > changes without justification. > > -- > Mel Gorman > SUSE Labs