Re: [PATCH v4 0/3] mm: improvements in shrink slab

Yafang Shao <laoar.shao@xxxxxxxxx> · Thu, 6 Jun 2019 22:18:41 +0800

On Thu, Jun 6, 2019 at 7:17 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> On Thu 06-06-19 18:14:37, Yafang Shao wrote:
> > In the past few days, I found an issue in shrink slab.
> > When I was trying to fix it, I found that some things in shrink slab need
> > to be improved.
> >
> > - #1 is to expose the min_slab_pages to help us analyze shrink slab.
> >
> > - #2 is a code improvement.
> >
> > - #3 is a fix for an issue. This issue is very easy to reproduce
> >   in zone reclaim mode.
> >   First you continuously cat random nonexistent files to produce
> >   more and more dentries, then you read a big file to produce page cache.
> >   Finally you will find that the dentries are never shrunk.
> >   In order to fix this issue, a new bitmask no_pagecache is introduced,
> >   which is 0 by default.
>
> Node reclaim mode is quite special and rarely used these days. Could you
> be more specific on how you got to see the above problems? Do you
> really need node reclaim in your usecases or is this more about
> testing and seeing what happens? Not that I am against these changes but
> I would like to understand the motivation. Especially because you are
> exposing some internal implementation details of the node reclaim to the
> userspace.

The slab issue we found on our servers was on an old kernel (kernel-3.10).
We found that the dentry cache kept growing without being shrunk in one container on a server,
so I read the slab code and found that memcg reclaim can't shrink slab in this old kernel,
but this issue has already been fixed upstream.

When I was reading the shrink slab code in the upstream kernel,
I found that slab can't be shrunk in node reclaim.
So I did some tests to reproduce this issue and posted this patchset to fix it.
With my patch, the issue I reproduced disappears.
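
The reproduction described in the cover letter can be sketched roughly as
below. This is only an illustration, not the exact test that was run: the
helper names, the lookup count and the big-file path are all hypothetical,
and it assumes node reclaim has already been enabled through
/proc/sys/vm/zone_reclaim_mode. It reads the first field of
/proc/sys/fs/dentry-state (the total number of dentries) to watch the dcache.

```python
import os
import tempfile


def create_negative_dentries(directory, count):
    """Look up `count` nonexistent names under `directory`.

    Each failed lookup is expected to leave a negative dentry in the dcache.
    Returns the number of lookups that failed with ENOENT.
    """
    misses = 0
    for i in range(count):
        try:
            os.stat(os.path.join(directory, "no-such-file-%d" % i))
        except FileNotFoundError:
            misses += 1
    return misses


def read_whole_file(path, chunk_size=1 << 20):
    """Read `path` sequentially to fill the page cache; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def nr_dentry(path="/proc/sys/fs/dentry-state"):
    """Return the first field of dentry-state, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def reproduce(big_file, lookups=1000000):
    """Hypothetical driver: sample the dentry count at three points.

    The counts are taken while the scratch directory still exists, so the
    samples are not affected by its removal at the end.
    """
    with tempfile.TemporaryDirectory() as scratch:
        before = nr_dentry()
        create_negative_dentries(scratch, lookups)
        after_lookups = nr_dentry()
        read_whole_file(big_file)
        after_read = nr_dentry()
    return before, after_lookups, after_read
```

Calling reproduce() with a file large enough to put the node under page
cache pressure and comparing the last two values gives a rough idea of
whether the dentries are being shrunk at all.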

But this is only the beginning in the node reclaim path...
Then I found another issue when I implemented a memory pressure monitor for our containers:
the vmpressure_prio() call is missing in the node reclaim path.

Well, it seems that whenever we introduce a new feature for page reclaim, we ignore the node reclaim path.

Regarding the node reclaim path, we always turn it off on our servers,
because we did see latency spikes caused by node reclaim
(the reason why node reclaim was turned on is not clear).

The reason I expose node reclaim details to userspace is that the user can already set node reclaim details now.

Thanks
Yafang