Johannes Weiner <hannes@xxxxxxxxxxx> writes: > On Wed, Apr 26, 2023 at 09:30:23AM +0800, Huang, Ying wrote: >> Johannes Weiner <hannes@xxxxxxxxxxx> writes: >> >> > On Tue, Apr 25, 2023 at 11:12:28AM +0800, Huang, Ying wrote: >> >> Johannes Weiner <hannes@xxxxxxxxxxx> writes: >> >> >> >> > Kswapd currently bails on higher-order allocations with an open-coded >> >> > check for whether it's reclaimed the compaction gap. >> >> > >> >> > compaction_suitable() is the customary interface to coordinate reclaim >> >> > with compaction. >> >> > >> >> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx> >> >> > --- >> >> > mm/vmscan.c | 67 ++++++++++++++++++----------------------------------- >> >> > 1 file changed, 23 insertions(+), 44 deletions(-) >> >> > >> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c >> >> > index ee8c8ca2e7b5..723705b9e4d9 100644 >> >> > --- a/mm/vmscan.c >> >> > +++ b/mm/vmscan.c >> >> > @@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx) >> >> > if (!managed_zone(zone)) >> >> > continue; >> >> > >> >> > + /* Allocation can succeed in any zone, done */ >> >> > if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) >> >> > mark = wmark_pages(zone, WMARK_PROMO); >> >> > else >> >> > mark = high_wmark_pages(zone); >> >> > if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx)) >> >> > return true; >> >> > + >> >> > + /* Allocation can't succeed, but enough order-0 to compact */ >> >> > + if (compaction_suitable(zone, order, >> >> > + highest_zoneidx) == COMPACT_CONTINUE) >> >> > + return true; >> >> >> >> Should we check the following first? >> >> >> >> order > 0 && zone_watermark_ok_safe(zone, 0, mark, highest_zoneidx) >> > >> > That's what compaction_suitable() does. It checks whether there are >> > enough migration targets for compaction (COMPACT_CONTINUE) or whether >> > reclaim needs to do some more work (COMPACT_SKIPPED). >> >> Yes. And I found that the watermark used in compaction_suitable() is >> low_wmark_pages() or min_wmark_pages(), which doesn't match the >> watermark here. Or did I miss something? > > Ahh, you're right, kswapd will bail prematurely. Compaction cannot > reliably meet the high watermark with a low watermark scratch space. > > I'll add the order check before the suitable test, for clarity, and so > that order-0 requests don't check the same thing twice. > > For the watermark, I'd make it an arg to compaction_suitable() and use > whatever the reclaimer targets (high for kswapd, min for direct). > > However, there is a minor snag. compaction_suitable() currently has > its own smarts regarding the watermark: > > /* > * Watermarks for order-0 must be met for compaction to be able to > * isolate free pages for migration targets. This means that the > * watermark and alloc_flags have to match, or be more pessimistic than > * the check in __isolate_free_page(). We don't use the direct > * compactor's alloc_flags, as they are not relevant for freepage > * isolation. We however do use the direct compactor's highest_zoneidx > * to skip over zones where lowmem reserves would prevent allocation > * even if compaction succeeds. > * For costly orders, we require low watermark instead of min for > * compaction to proceed to increase its chances. > * ALLOC_CMA is used, as pages in CMA pageblocks are considered > * suitable migration targets > */ > watermark = (order > PAGE_ALLOC_COSTLY_ORDER) ? > low_wmark_pages(zone) : min_wmark_pages(zone); > > Historically it has always checked low instead of min. Then Vlastimil > changed it to min for non-costly orders here: > > commit 8348faf91f56371d4bada6fc5915e19580a15ffe > Author: Vlastimil Babka <vbabka@xxxxxxx> > Date: Fri Oct 7 16:58:00 2016 -0700 > > mm, compaction: require only min watermarks for non-costly orders > > The __compaction_suitable() function checks the low watermark plus a > compact_gap() gap to decide if there's enough free memory to perform > compaction. Then __isolate_free_page uses low watermark check to decide > if particular free page can be isolated. In the latter case, using low > watermark is needlessly pessimistic, as the free page isolations are > only temporary. For __compaction_suitable() the higher watermark makes > sense for high-order allocations where more freepages increase the > chance of success, and we can typically fail with some order-0 fallback > when the system is struggling to reach that watermark. But for > low-order allocation, forming the page should not be that hard. So > using low watermark here might just prevent compaction from even trying, > and eventually lead to OOM killer even if we are above min watermarks. > > So after this patch, we use min watermark for non-costly orders in > __compaction_suitable(), and for all orders in __isolate_free_page(). > > Lowering to min wasn't an issue for non-costly, but AFAICS there was > no explicit testing for whether min would work for costly orders too. > > I'd propose trying it with min even for costly and see what happens. > > If it does regress, a better place to boost scratch space for costly > orders might be compact_gap(), so I'd move it there. > > Does that sound reasonable? Sounds good to me, Thanks! Best Regards, Huang, Ying