On Wed, Feb 12, 2020 at 08:25:45PM +0800, Yafang Shao wrote:
> On Wed, Feb 12, 2020 at 1:55 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > Another variant of this problem was recently observed, where the
> > kernel violates cgroups' memory.low protection settings and reclaims
> > page cache way beyond the configured thresholds. It was followed by a
> > proposal of a modified form of the reverted commit above, that
> > implements memory.low-sensitive shrinker skipping over populated
> > inodes on the LRU [1]. However, this proposal continues to run the
> > risk of attracting disproportionate reclaim pressure to a pool of
> > still-used inodes,
>
> Hi Johannes,
>
> If you really think that is a risk, what about below additional patch
> to fix this risk ?
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 80dddbc..61862d9 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -760,7 +760,7 @@ static bool memcg_can_reclaim_inode(struct inode *inode,
>                 goto out;
>
>         cgroup_size = mem_cgroup_size(memcg);
> -       if (inode->i_data.nrpages + protection >= cgroup_size)
> +       if (inode->i_data.nrpages)
>                 reclaimable = false;
>
>  out:
>
> With this additional patch, we skip all inodes in this memcg until all
> its page cache pages are reclaimed.

Well that's something we've tried and had to revert because it caused
issues in slab reclaim. See the History part of my changelog.

> > while not addressing the more generic reclaim
> > inversion problem outside of a very specific cgroup application.
> >
>
> But I have a different understanding. This method works like a
> knob. If you really care about your workingset (data), you should
> turn it on (i.e. by using memcg protection to protect them), while
> if you don't care about your workingset (data) then you'd better
> turn it off. That would be more flexible. Regarding your case in the
> commit log, why not protect your linux git tree with memcg
> protection ?

I can't imagine a scenario where I *wouldn't* care about my
workingset, though.
Why should it be opt-in, not the default?
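
For readers following the quoted hunk, here is a minimal user-space sketch of the two conditions it swaps. This is not kernel code: struct inode_model, reclaimable_before(), reclaimable_after() and model_self_check() are hypothetical names made up for illustration, and only the single visible `if` is modelled. Whatever memcg_can_reclaim_inode() does before the `goto out;` is not shown in the hunk and is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one field the quoted hunk reads. */
struct inode_model {
	unsigned long nrpages;	/* models inode->i_data.nrpages */
};

/*
 * Condition on the '-' line: the inode is kept when its page cache plus
 * the protection value reaches the cgroup size. How 'protection' is
 * derived is not visible in the hunk, so it is a plain parameter here.
 */
bool reclaimable_before(const struct inode_model *inode,
			unsigned long protection,
			unsigned long cgroup_size)
{
	return !(inode->nrpages + protection >= cgroup_size);
}

/*
 * Condition on the '+' line: the inode is kept whenever it still has
 * any page cache pages attached, regardless of protection or size.
 */
bool reclaimable_after(const struct inode_model *inode)
{
	return inode->nrpages == 0;
}

/* Small self-check with made-up numbers (units are pages). */
void model_self_check(void)
{
	struct inode_model cached = { .nrpages = 10 };
	struct inode_model empty = { .nrpages = 0 };

	/* 10 + 100 < 1000: the '-' condition lets the inode go. */
	assert(reclaimable_before(&cached, 100, 1000));
	/* 10 + 995 >= 1000: the '-' condition keeps the inode. */
	assert(!reclaimable_before(&cached, 995, 1000));
	/* The '+' condition keeps any populated inode... */
	assert(!reclaimable_after(&cached));
	/* ...and only lets fully emptied inodes go. */
	assert(reclaimable_after(&empty));
}
```

Under these assumptions the '+' variant never frees a populated inode in a memcg that reaches this check, which is the "skip all inodes until all page cache pages are reclaimed" behaviour discussed above.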