On Wed, Nov 24, 2021 at 07:54:49PM +0900, Alexey Avramov wrote:
> > it does eventually get killed OOM
>
> However, a full minute freeze can be a great evil in many situations -
> during such a freeze, the system is completely unresponsive.
>
> So my next question is: How reasonable is the value MAX_RECLAIM_RETRIES?
> Is it also get "out of thin air"?
>

The value is out of thin air but adjusting it may reintroduce issues
with kswapd running at 100% CPU.

> And would it make sense to have buttons to adjust the timeouts?

I don't think we should introduce a tunable for something like this; it
would be impossible to use properly. That said, can you test this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 07db03883062..aa72c0f39dcc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1058,6 +1058,14 @@ void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason)
 		break;
 	case VMSCAN_THROTTLE_NOPROGRESS:
 		timeout = HZ/2;
+
+		/*
+		 * If kswapd is disabled, use the minimum timeout as the
+		 * system may be at or near OOM.
+		 */
+		if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+			timeout = 1;
+
 		break;
 	case VMSCAN_THROTTLE_ISOLATED:
 		timeout = HZ/50;
@@ -3395,7 +3403,7 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 		return;
 
 	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2)
+	if (sc->priority < DEF_PRIORITY - 2 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
 
@@ -3415,6 +3423,7 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 	unsigned long nr_soft_scanned;
 	gfp_t orig_mask;
 	pg_data_t *last_pgdat = NULL;
+	pg_data_t *first_pgdat = NULL;
 
 	/*
 	 * If the number of buffer_heads in the machine exceeds the maximum
@@ -3478,14 +3487,18 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 			/* need some check for avoid more shrink_zone() */
 		}
 
+		if (!first_pgdat)
+			first_pgdat = zone->zone_pgdat;
+
 		/* See comment about same check for global reclaim above */
 		if (zone->zone_pgdat == last_pgdat)
 			continue;
 		last_pgdat = zone->zone_pgdat;
 		shrink_node(zone->zone_pgdat, sc);
-		consider_reclaim_throttle(zone->zone_pgdat, sc);
 	}
 
+	consider_reclaim_throttle(first_pgdat, sc);
+
 	/*
 	 * Restore to original mask to avoid the impact on the caller if we
 	 * promoted it to __GFP_HIGHMEM.
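
For anyone who wants to reason about the two behavioural changes without
building a kernel, here is a rough userspace sketch of the decision logic
only. It is not kernel code: the function names throttle_timeout() and
should_throttle() are made up for illustration, and the values of HZ,
DEF_PRIORITY and MAX_RECLAIM_RETRIES are assumptions that depend on the
kernel configuration and version.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; the real ones depend on the kernel. */
#define HZ                  250
#define DEF_PRIORITY        12
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the two throttle reasons touched by the patch. */
enum throttle_reason {
	THROTTLE_NOPROGRESS,
	THROTTLE_ISOLATED,
};

/*
 * Model of the timeout choice after the patch: a no-progress throttle
 * normally sleeps HZ/2 ticks, but drops to a single tick once kswapd has
 * failed often enough to be considered disabled.
 */
long throttle_timeout(enum throttle_reason reason, int kswapd_failures)
{
	long timeout = 0;

	switch (reason) {
	case THROTTLE_NOPROGRESS:
		timeout = HZ / 2;
		if (kswapd_failures >= MAX_RECLAIM_RETRIES)
			timeout = 1;
		break;
	case THROTTLE_ISOLATED:
		timeout = HZ / 50;
		break;
	}

	return timeout;
}

/*
 * Model of the patched check: throttle only when the scan priority has
 * dropped below DEF_PRIORITY - 2 and nothing at all was reclaimed.
 */
bool should_throttle(int priority, unsigned long nr_reclaimed)
{
	return priority < DEF_PRIORITY - 2 && !nr_reclaimed;
}
```

With these assumed values, throttle_timeout(THROTTLE_NOPROGRESS, 0) is
HZ/2 ticks, while the same call with kswapd_failures at
MAX_RECLAIM_RETRIES is 1 tick. should_throttle() returns false whenever
any pages were reclaimed, regardless of priority.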