Re: + mm-memcontrol-dont-throttle-dying-tasks-on-memoryhigh.patch added to mm-hotfixes-unstable branch

On Tue, Jan 16, 2024 at 01:45:47PM -0800, Andrew Morton wrote:
> 
> The patch titled
>      Subject: mm: memcontrol: don't throttle dying tasks on memory.high
> has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
>      mm-memcontrol-dont-throttle-dying-tasks-on-memoryhigh.patch
> 
> This patch will shortly appear at
>      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-memcontrol-dont-throttle-dying-tasks-on-memoryhigh.patch

Hi Andrew,

there is an updated version of this patch from Johannes in the same
thread. It looks like you've picked up the original version. Please
pick up the new one instead.

Thank you!


> 
> This patch will later appear in the mm-hotfixes-unstable branch at
>     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next via the mm-everything
> branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> and is updated there every 2-3 working days
> 
> ------------------------------------------------------
> From: Johannes Weiner <hannes@xxxxxxxxxxx>
> Subject: mm: memcontrol: don't throttle dying tasks on memory.high
> Date: Thu, 11 Jan 2024 08:29:02 -0500
> 
> While investigating hosts with high cgroup memory pressures, Tejun
> found culprit zombie tasks that were holding on to a lot of
> memory, had SIGKILL pending, but were stuck in memory.high reclaim.
> 
> In the past, we used to always force-charge allocations from tasks
> that were exiting in order to accelerate them dying and freeing up
> their rss. This changed for memory.max in a4ebf1b6ca1e ("memcg:
> prohibit unconditional exceeding the limit of dying tasks"); it noted
> that this can cause (userspace inducible) containment failures, so it
> added a mandatory reclaim and OOM kill cycle before forcing charges.
> At the time, memory.high enforcement was handled in the userspace
> return path, which isn't reached by dying tasks, and so memory.high
> was still never enforced by dying tasks.
> 
> When c9afe31ec443 ("memcg: synchronously enforce memory.high for large
> overcharges") added synchronous reclaim for memory.high, it added
> unconditional memory.high enforcement for dying tasks as well. The
> callstack shows that this path is where the zombie is stuck.
> 
> We need to accelerate dying tasks getting past memory.high, but we
> cannot do it quite the same way as we do for memory.max: memory.max is
> enforced strictly, and tasks aren't allowed to move past it without
> FIRST reclaiming and OOM killing if necessary. This ensures very small
> levels of excess. With memory.high, though, enforcement happens lazily
> after the charge, and OOM killing is never triggered. A lot of
> concurrent threads could have pushed, or could actively be pushing,
> the cgroup into excess. The dying task will enter reclaim on every
> allocation attempt, with little hope of restoring balance.
> 
> To fix this, skip synchronous memory.high enforcement on dying tasks
> altogether again. Update memory.high path documentation while at it.
> 
> Link: https://lkml.kernel.org/r/20240111132902.389862-1-hannes@xxxxxxxxxxx
> Fixes: c9afe31ec443 ("memcg: synchronously enforce memory.high for large overcharges")
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Reported-by: Tejun Heo <tj@xxxxxxxxxx>
> Reviewed-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Acked-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> Cc: Dan Schatzberg <schatzberg.dan@xxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Muchun Song <muchun.song@xxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> 
>  mm/memcontrol.c |   24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
> 
> --- a/mm/memcontrol.c~mm-memcontrol-dont-throttle-dying-tasks-on-memoryhigh
> +++ a/mm/memcontrol.c
> @@ -2623,8 +2623,9 @@ static unsigned long calculate_high_dela
>  }
>  
>  /*
> - * Scheduled by try_charge() to be executed from the userland return path
> - * and reclaims memory over the high limit.
> + * Reclaims memory over the high limit. Called directly from
> + * try_charge() when possible, but also scheduled to be called from
> + * the userland return path where reclaim is always able to block.
>   */
>  void mem_cgroup_handle_over_high(gfp_t gfp_mask)
>  {
> @@ -2693,6 +2694,9 @@ retry_reclaim:
>  	}
>  
>  	/*
> +	 * Reclaim didn't manage to push usage below the limit, slow
> +	 * this allocating task down.
> +	 *
>  	 * If we exit early, we're guaranteed to die (since
>  	 * schedule_timeout_killable sets TASK_KILLABLE). This means we don't
>  	 * need to account for any ill-begotten jiffies to pay them off later.
> @@ -2887,8 +2891,22 @@ done_restock:
>  		}
>  	} while ((memcg = parent_mem_cgroup(memcg)));
>  
> +	/*
> +	 * Reclaim is scheduled for the userland return path already,
> +	 * but also attempt synchronous reclaim to avoid excessive
> +	 * overrun while the task is still inside the kernel. If this
> +	 * is successful, the return path will see it when it rechecks
> +	 * the overage, and simply bail out.
> +	 *
> +	 * Skip if the task is already dying, though. Unlike
> +	 * memory.max, memory.high enforcement isn't as strict, and
> +	 * there is no OOM killer involved, which means the excess
> +	 * could already be much bigger (and still growing) than it
> +	 * could for memory.max; the dying task could get stuck in
> +	 * fruitless reclaim for a long time, which isn't desirable.
> +	 */
>  	if (current->memcg_nr_pages_over_high > MEMCG_CHARGE_BATCH &&
> -	    !(current->flags & PF_MEMALLOC) &&
> +	    !(current->flags & PF_MEMALLOC) && !task_is_dying() &&
>  	    gfpflags_allow_blocking(gfp_mask)) {
>  		mem_cgroup_handle_over_high(gfp_mask);
>  	}
> _
> 
> Patches currently in -mm which might be from hannes@xxxxxxxxxxx are
> 
> mm-memcontrol-dont-throttle-dying-tasks-on-memoryhigh.patch
> 
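
For readers following the thread, the condition changed by the last hunk
of the quoted patch can be modelled in user space roughly as below. This
is only a sketch under stated assumptions: struct model_task,
MODEL_CHARGE_BATCH and model_should_reclaim_sync() are hypothetical names
invented for illustration, the batch value is a placeholder, and none of
this is kernel code.

```c
#include <stdbool.h>

/*
 * User-space model of the check at the end of try_charge().
 * Every name here is a hypothetical stand-in, not a kernel symbol.
 */

/* Assumed placeholder for MEMCG_CHARGE_BATCH; the kernel's value may differ. */
#define MODEL_CHARGE_BATCH 64U

struct model_task {
	unsigned int nr_pages_over_high; /* models current->memcg_nr_pages_over_high */
	bool in_memalloc;                /* models current->flags & PF_MEMALLOC */
	bool dying;                      /* models task_is_dying() */
};

/*
 * Returns true when the patched condition would call
 * mem_cgroup_handle_over_high() synchronously.
 */
static bool model_should_reclaim_sync(const struct model_task *t,
				      bool gfp_allows_blocking)
{
	return t->nr_pages_over_high > MODEL_CHARGE_BATCH &&
	       !t->in_memalloc && !t->dying && gfp_allows_blocking;
}
```

In this model, a task that is 100 pages over the limit in a blocking
context gets true while it is alive and false once dying is set, which is
the behaviour the patch description asks for.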



