Re: [PATCH] mm/vmscan.c: no need to double-check if free pages are under high-watermark

On Sun, Jan 02, 2022 at 12:31:29PM +0900, skseofh@xxxxxxxxx wrote:
> From: Daero Lee <skseofh@xxxxxxxxx>
> 
> In kswapd_try_to_sleep function, to check whether kswapd can sleep,
> the prepare_kswapd_sleep function is called twice.
> 
> If free pages are below high-watermark in the first call,
> the @remaining variable is not updated at 0 and the
> prepare_kswapd_sleep function is called for the second time.
> 
> I think it is necessary to set the initial value of the
> @remaining to a non-zero value to prevent consecutive calls
> to the same function.
> 
> Signed-off-by: Daero Lee <skseofh@xxxxxxxxx>
> ---
>  mm/vmscan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 700434db5735..1217ecec5bbb 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4331,7 +4331,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
>  	/*
>  	 * Return the order kswapd stopped reclaiming at as
>  	 * prepare_kswapd_sleep() takes it into account. If another caller
> -	 * entered the allocator slow path while kswapd was awake, order will
> +	 * entered the allqocator slow path while kswapd was awake, order will
>  	 * remain at the higher level.
>  	 */
>  	return sc.order;

This hunk just adds a typo, drop it.

> @@ -4355,7 +4355,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat,
>  static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order,
>  				unsigned int highest_zoneidx)
>  {
> -	long remaining = 0;
> +	long remaining = ~0;
>  	DEFINE_WAIT(wait);
>  
>  	if (freezing(current) || kthread_should_stop())

While this does avoid calling prepare_kswapd_sleep() twice if the pgdat
is balanced on the first try, it then does not restore the vmstat
thresholds and doesn't call schedule() for kswapd to go to sleep.

I think you did spot a problem, but I suspect you want something like
the following untested patch:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 700434db5735..40784693c840 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4355,7 +4355,8 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat,
 static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order,
 				unsigned int highest_zoneidx)
 {
-	long remaining = 0;
+	long remaining;
+	bool balanced;
 	DEFINE_WAIT(wait);
 
 	if (freezing(current) || kthread_should_stop())
@@ -4370,7 +4371,8 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	 * eligible zone balanced that it's also unlikely that compaction will
 	 * succeed.
 	 */
-	if (prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) {
+	balanced = prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx);
+	if (balanced) {
 		/*
 		 * Compaction records what page blocks it recently failed to
 		 * isolate pages from and skips them in the future scanning.
@@ -4387,6 +4389,10 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 
 		remaining = schedule_timeout(HZ/10);
 
+		/* Is pgdat balanced after a short sleep? */
+		balanced = prepare_kswapd_sleep(pgdat, reclaim_order,
+							highest_zoneidx);
+
 		/*
 		 * If woken prematurely then reset kswapd_highest_zoneidx and
 		 * order. The values will either be from a wakeup request or
@@ -4406,11 +4412,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	}
 
 	/*
-	 * After a short sleep, check if it was a premature sleep. If not, then
-	 * go fully to sleep until explicitly woken up.
+	 * If balanced to the high watermark, restore vmstat thresholds and
+	 * kswapd goes to sleep. If kswapd remains awake, account whether
+	 * the low or high watermark was hit quickly.
 	 */
-	if (!remaining &&
-	    prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) {
+	if (balanced) {
 		trace_mm_vmscan_kswapd_sleep(pgdat->node_id);
 
 		/*
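
To make the control flow of the three variants easier to compare, here is
a minimal userspace sketch. It is only a model, not kernel code: every
name in it (sleep_model, model_balanced, sleep_current,
sleep_nonzero_init, sleep_balanced_flag, run_variant) is hypothetical,
model_balanced() stands in for prepare_kswapd_sleep() by replaying a
scripted list of results, and timeout_left stands in for the return
value of schedule_timeout(HZ/10). It only counts how often the balance
check runs and whether the final "go fully to sleep" branch is reached.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in mm/vmscan.c. */
struct sleep_model {
	const bool *script;	/* scripted results of successive balance checks */
	int nscript;		/* number of entries in script */
	int checks;		/* calls to the stand-in for prepare_kswapd_sleep() */
	bool full_sleep;	/* reached the final "go fully to sleep" branch */
};

/* Stand-in for prepare_kswapd_sleep(): replay the script, then report false. */
static bool model_balanced(struct sleep_model *m)
{
	bool ret = m->checks < m->nscript ? m->script[m->checks] : false;

	m->checks++;
	return ret;
}

/* Shared body for the two variants that differ only in the initial value. */
static void sleep_with_initial(struct sleep_model *m, long initial,
			       long timeout_left)
{
	long remaining = initial;

	if (model_balanced(m))
		remaining = timeout_left;	/* models schedule_timeout(HZ/10) */

	if (!remaining && model_balanced(m))
		m->full_sleep = true;
}

/* Current code: remaining starts at 0. */
static void sleep_current(struct sleep_model *m, long timeout_left)
{
	sleep_with_initial(m, 0, timeout_left);
}

/* Submitted patch: remaining starts at ~0. */
static void sleep_nonzero_init(struct sleep_model *m, long timeout_left)
{
	sleep_with_initial(m, ~0L, timeout_left);
}

/* Suggested alternative: remember the result of the check in a flag. */
static void sleep_balanced_flag(struct sleep_model *m, long timeout_left)
{
	bool balanced = model_balanced(m);

	/* Unused here: in the proposed diff the final branch tests only the flag. */
	(void)timeout_left;

	if (balanced)
		balanced = model_balanced(m);	/* recheck after the short sleep */

	if (balanced)
		m->full_sleep = true;
}

typedef void (*sleep_fn)(struct sleep_model *, long);

/* Run one variant against a script and return the resulting counters. */
static struct sleep_model run_variant(sleep_fn fn, const bool *script,
				      int nscript, long timeout_left)
{
	struct sleep_model m = { script, nscript, 0, false };

	fn(&m, timeout_left);
	return m;
}
```

In this model, with a script of { false, true } and timeout_left of 0,
sleep_current() runs the check twice and reaches the final branch, while
sleep_nonzero_init() and sleep_balanced_flag() each run it once and do
not. With { true, true } all three run the check twice and reach the
final branch.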



