Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()

Hi Michal,

On Tue, May 10, 2022 at 07:43:41PM +0200, Michal Koutný wrote:
> On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > So I think we're OK with [2/5] now.  Unless there be objections, I'll
> > be looking to get this series into mm-stable later this week.
> 
> I'm sorry, I think the current form of the test reveals an unexpected
> behavior of reclaim and silencing the test is not the way to go.
> Although, I may be convinced that my understanding is wrong.

Looking through your demo results again, I agree with you. It's a tiny
error, but it compounds and systematically robs the protected group
over and over, to the point where its protection becomes worthless -
at least in idle groups, which isn't super common but does happen.

Let's keep the test as-is and fix reclaim to make it pass ;)

> The obvious fix is at the end of this message, it resolves the case I
> posted earlier (with memory_recursiveprot), however, it "breaks"
> memory.events:low accounting inside recursive children, hence I'm not
> considering it finished. (I may elaborate on the breaking case if
> interested, I also need to look more into that myself).

Can you elaborate on the problem you see with low events?

> @@ -2798,13 +2798,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  
>  			scan = lruvec_size - lruvec_size * protection /
>  				(cgroup_size + 1);
> -
> -			/*
> -			 * Minimally target SWAP_CLUSTER_MAX pages to keep
> -			 * reclaim moving forwards, avoiding decrementing
> -			 * sc->priority further than desirable.
> -			 */
> -			scan = max(scan, SWAP_CLUSTER_MAX);

IIRC this was added due to premature OOMs in synthetic testing (Chris
may remember more details).

However, in practice it wasn't enough anyway, and was followed up by
f56ce412a59d ("mm: memcontrol: fix occasional OOMs due to proportional
memory.low reclaim"). Now, reclaim retries the whole cycle if
proportional protection was in place and it didn't manage to make
progress, so rounding the scan target up to guarantee forward progress
no longer seems necessary.
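
To make the arithmetic concrete, here is a rough userspace sketch of the
scan target calculation from the hunk above, with and without the clamp.
It is not kernel code: sketch_scan_target() and SKETCH_SWAP_CLUSTER_MAX
are made-up names, the value 32 is my assumption for SWAP_CLUSTER_MAX,
and the later shift by sc->priority is left out.

```c
#include <stdbool.h>

/* Assumption: stands in for the kernel's SWAP_CLUSTER_MAX (32, I believe). */
#define SKETCH_SWAP_CLUSTER_MAX 32UL

/*
 * Hypothetical helper, for illustration only: the proportional scan
 * target for one LRU list, before any shift by the reclaim priority.
 * All sizes are in pages.
 */
unsigned long sketch_scan_target(unsigned long lruvec_size,
				 unsigned long cgroup_size,
				 unsigned long protection,
				 bool clamp)
{
	unsigned long scan;

	/* Guard so the subtraction below cannot underflow. */
	if (cgroup_size < protection)
		cgroup_size = protection;

	/* Same expression as in the quoted hunk. */
	scan = lruvec_size - lruvec_size * protection / (cgroup_size + 1);

	/* The rounding that the proposed patch removes. */
	if (clamp && scan < SKETCH_SWAP_CLUSTER_MAX)
		scan = SKETCH_SWAP_CLUSTER_MAX;

	return scan;
}
```

For a fully protected group of 1000 pages (lruvec_size, cgroup_size and
protection all 1000), the unclamped target is 1 page, while the clamped
one is 32 pages. That difference is the small per-cycle overshoot that
adds up for an idle protected group.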

So your proposed patch looks like the right thing to do to me. And I
would ack it, but please do explain your concerns around low event
reporting after it.

Thanks!


