Re: [PATCH] mm/compaction: use proper zoneid for compaction_suitable()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 26, 2019 at 3:09 PM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jul 25, 2019 at 09:50:21AM -0400, Yafang Shao wrote:
> > By now there're three compaction paths,
> > - direct compaction
> > - kcompactd compcation
> > - proc triggered compaction
> > When we do compaction in all these paths, we will use compaction_suitable()
> > to check whether a zone is suitable to do compaction.
> >
> > There're some issues around the usage of compaction_suitable().
> > We don't use the proper zoneid in kcompactd_node_suitable() when try to
> > wakeup kcompactd. In the kcompactd compaction paths, we call
> > compaction_suitable() twice and the zoneid isn't proper in the second call.
> > For proc triggered compaction, the classzone_idx is always zero.
> >
> > In order to fix these issues, I change the type of classzone_idx in the
> > struct compact_control from const int to int and assign the proper zoneid
> > before calling compact_zone().
> >
>
> What is actually fixed by this?
>

Recently there's a page alloc failure on our server because the
compaction can't satisfy it.
This issue is unproducible, so I have to view the compaction code and
find out the possible solutions.
When I'm reading these compaction code, I find some  misuse of
compaction_suitable().
But after you point out, I find that I missed something.
The classzone_idx should represent the alloc request, otherwise we may
do unnecessary compaction on a zone.
Thanks a lot for your explaination.

Hi Andrew,

Pls. help drop this patch. Sorry about that.
I will think about it more.

> > This patch also fixes some comments in struct compact_control, as these
> > fields are not only for direct compactor but also for all other compactors.
> >
> > Fixes: ebff398017c6("mm, compaction: pass classzone_idx and alloc_flags to watermark checking")
> > Fixes: 698b1b30642f("mm, compaction: introduce kcompactd")
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > Cc: Vlastimil Babka <vbabka@xxxxxxx>
> > Cc: Arnd Bergmann <arnd@xxxxxxxx>
> > Cc: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>
> > Cc: Rik van Riel <riel@xxxxxxxxxx>
> > Cc: Yafang Shao <shaoyafang@xxxxxxxxxxxxxx>
> > ---
> >  mm/compaction.c | 12 +++++-------
> >  mm/internal.h   | 10 +++++-----
> >  2 files changed, 10 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index ac4ead0..984dea7 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -2425,6 +2425,7 @@ static void compact_node(int nid)
> >                       continue;
> >
> >               cc.zone = zone;
> > +             cc.classzone_idx = zoneid;
> >
> >               compact_zone(&cc, NULL);
> >
> > @@ -2508,7 +2509,7 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
> >                       continue;
> >
> >               if (compaction_suitable(zone, pgdat->kcompactd_max_order, 0,
> > -                                     classzone_idx) == COMPACT_CONTINUE)
> > +                                     zoneid) == COMPACT_CONTINUE)
> >                       return true;
> >       }
> >
>
> This is a semantic change. The use of the classzone_idx here and not
> classzone_idx is so that the watermark check takes the lowmem reserves
> into account in the __zone_watermark_ok check. This means that
> compaction is more likely to proceed but not necessarily correct.
>
> > @@ -2526,7 +2527,6 @@ static void kcompactd_do_work(pg_data_t *pgdat)
> >       struct compact_control cc = {
> >               .order = pgdat->kcompactd_max_order,
> >               .search_order = pgdat->kcompactd_max_order,
> > -             .classzone_idx = pgdat->kcompactd_classzone_idx,
> >               .mode = MIGRATE_SYNC_LIGHT,
> >               .ignore_skip_hint = false,
> >               .gfp_mask = GFP_KERNEL,
> > @@ -2535,7 +2535,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
> >                                                       cc.classzone_idx);
> >       count_compact_event(KCOMPACTD_WAKE);
> >
> > -     for (zoneid = 0; zoneid <= cc.classzone_idx; zoneid++) {
> > +     for (zoneid = 0; zoneid <= pgdat->kcompactd_classzone_idx; zoneid++) {
> >               int status;
> >
> >               zone = &pgdat->node_zones[zoneid];
>
> This variable can be updated by a wakeup while the loop is executing
> making the loop more difficult to reason about given the exit conditions
> can change.
>

Thanks for your point out.

But seems there're still issues event without my change ?
For example,
If we call wakeup_kcompactd() while the kcompactd is running,
we just modify the kcompactd_max_order and kcompactd_classzone_idx and
then return.
Then in another path, the wakeup_kcompactd() is called again,
so kcompactd_classzone_idx and kcompactd_max_order will be override,
that means the previous wakeup is missed.
Right ?


> Please explain what exactly this patch is fixing and why it should be
> done because it currently appears to be making a number of subtle
> changes without justification.
>
> --
> Mel Gorman
> SUSE Labs




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux