On Wed 20-11-19 18:53:44, Yafang Shao wrote:
> On Wed, Nov 20, 2019 at 6:22 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Wed 20-11-19 03:53:05, Yafang Shao wrote:
> > > A task running in a memcg may OOM because of the memory.min settings of its
> > > sibling and parent. If this happens, the current oom messages can't show
> > > why file page cache can't be reclaimed.
> >
> > min limit is not the only way to protect memory from being reclaimed. The
> > memory might be pinned or unreclaimable for other reasons (e.g. swap
> > quota exceeded for memcg).
>
> Both swap and unreclaimable (unevictable) are printed in OOM messages.

Not really. Consider a memcg which has reached its swap limit. The
anonymous memory is not really reclaimable even when there is a lot of
swap space available.

> If something else can prevent the file cache being reclaimed, we'd
> better show them as well.

How are you going to do that? How do you track pins on pages?

> > Besides that, there is the very same problem
> > with the global OOM killer, right? And I do not expect we want to print
> > all memcgs in the system (this might be hundreds).
> >
>
> I forgot the global oom...
>
> Why not just print the memcgs which are under memory.min protection or
> something like a total number of min protected memory ?

Yes, this would likely help. But the main question really remains, is
this really worth it?

> > > So it is better to show the memcg
> > > min settings.
> > > Let's take an example.
> > >        bar         bar/memory.max = 1200M memory.min=800M
> > >       /   \
> > >    barA   barB     barA/memory.min = 800M memory.current=1G (file page cache)
> > >                    barB/memory.min = 0 (process in this memcg is allocating page)
> > >
> > > The process will do memcg reclaim if the bar/memory.max is reached. Once
> > > the barA/memory.min is reached it will stop reclaiming file page caches in
> > > barA, and if there are no reclaimable pages in bar and bar/barB it will
> > > enter memcg OOM then.
> > > After this patch, the messages below will be shown (only including the
> > > relevant messages here). The lines beginning with '#' are newly added info (the
> > > '#' symbol is not in the original messages).
> > > memory: usage 1228800kB, limit 1228800kB, failcnt 18337
> > > ...
> > > # Memory cgroup min setting:
> > > # /bar: min 819200KB emin 0KB
> > > # /bar/barA: min 819200KB emin 819200KB
> > > # /bar/barB: min 0KB emin 0KB
> > > ...
> > > Memory cgroup stats for /bar:
> > > anon 418328576
> > > file 835756032
> > > ...
> > > unevictable 0
> > > ...
> > > oom-kill:constraint=CONSTRAINT_MEMCG..oom_memcg=/bar,task_memcg=/bar/barB
> > >
> > > With the newly added information, we can find the memory.min in bar/barA is
> > > reached and the processes in bar/barB can't reclaim file page cache from
> > > bar/barA any more. While without this newly added information we don't know
> > > why the file page cache in bar can't be reclaimed.
> >
> > Well, I am not sure this is really useful enough TBH. It doesn't give
> > you the whole picture and it potentially generates a lot of output in
> > the oom report. FYI we used to have a more precise break down of
> > counters in memcg hierarchy, see 58cf188ed649 ("memcg, oom: provide more
> > precise dump info while memcg oom happening") which later got rewritten
> > by c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup OOM")
> >
>
> At least we'd better print a total protected memory in the oom messages.
>
> > Could you be more specific why do you really need this piece of
> > information?
>
> I have said in the commit log, that we don't know why the file cache
> can't be reclaimed (when evictable is 0 and dirty is 0 as well.)

And the counter argument is that this will not help you there much in
many large and much more common cases. I argue, and I might be wrong
here so feel free to correct me, that the reclaim protection guarantee
(min) is something to be under the admin's control.
It shouldn't really happen willy-nilly because it has really large
consequences, OOM included. So if there is a suspicious amount of
memory that could be reclaimed normally then the reclaim protection is
really the first suspect to go after.
--
Michal Hocko
SUSE Labs
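
For reference, the "total protected memory" idea discussed above can be checked by hand against the numbers in the quoted example. The following is only a rough sketch, not anything from the patch itself: it assumes the "min ... emin ..." line format proposed in the quoted commit log (which a mainline kernel is not known to print), it assumes that summing emin over the direct children of the OOM memcg is a fair approximation of the protected total, and it assumes all file cache under /bar lives in protected children. The helper names are hypothetical.

```python
import re

# Lines in the format shown in the quoted patch description. This is the
# *proposed* output; it is an assumption that a report would look like this.
REPORT = """\
/bar: min 819200KB emin 0KB
/bar/barA: min 819200KB emin 819200KB
/bar/barB: min 0KB emin 0KB
"""

LINE_RE = re.compile(r"^(?P<path>\S+): min (?P<min>\d+)KB emin (?P<emin>\d+)KB$")


def parse_min_report(text):
    """Return {cgroup_path: (min_kb, emin_kb)} for each well-formed line."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            result[m["path"]] = (int(m["min"]), int(m["emin"]))
    return result


def protected_kb_below(entries, oom_memcg):
    """Sum emin over the direct children of oom_memcg (hypothetical helper)."""
    prefix = oom_memcg.rstrip("/") + "/"
    return sum(
        emin
        for path, (_min, emin) in entries.items()
        if path.startswith(prefix) and "/" not in path[len(prefix):]
    )


entries = parse_min_report(REPORT)
protected_kb = protected_kb_below(entries, "/bar")

# "file" counter from the quoted memcg stats, converted from bytes to KB.
file_kb = 835756032 // 1024

# If the file cache does not exceed the protected total, reclaim protection
# is a plausible reason why that cache could not be reclaimed.
print(protected_kb, file_kb, file_kb <= protected_kb)
```

With the quoted numbers the file cache is slightly smaller than the protected total, which is consistent with memory.min in bar/barA being the reason the cache was not reclaimed. It says nothing about pinned pages or swap limits, which the report would still not show.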