On Fri, Nov 22, 2019 at 6:28 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Wed 20-11-19 20:23:54, Yafang Shao wrote:
> > On Wed, Nov 20, 2019 at 7:40 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Wed 20-11-19 18:53:44, Yafang Shao wrote:
> > > > On Wed, Nov 20, 2019 at 6:22 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed 20-11-19 03:53:05, Yafang Shao wrote:
> > > > > > A task running in a memcg may OOM because of the memory.min settings of its
> > > > > > sibling and parent. If this happens, the current oom messages can't show
> > > > > > why file page cache can't be reclaimed.
> > > > >
> > > > > min limit is not the only way to protect memory from being reclaimed. The
> > > > > memory might be pinned or unreclaimable for other reasons (e.g. swap
> > > > > quota exceeded for memcg).
> > > >
> > > > Both swap and unreclaimable (unevictable) are printed in OOM messages.
> > >
> > > Not really. Consider a memcg which has reached its swap limit. The
> > > anonymous memory is not really reclaimable even when there is a lot of
> > > swap space available.
> > >
> >
> > The memcg swap limit is already printed in oom messages, see below,
> >
> > [ 141.721625] memory: usage 1228800kB, limit 1228800kB, failcnt 18337
> > [ 141.721958] swap: usage 0kB, limit 9007199254740988kB, failcnt 0
>
> But you do not have any insight on the swap limit down the oom
> hierarchy, do you?
>
> > > > Why not just print the memcgs which are under memory.min protection or
> > > > something like a total number of min protected memory?
> > >
> > > Yes, this would likely help. But the main question really remains, is
> > > this really worth it?
> > >
> >
> > If it doesn't cost too much, I think it is worth doing.
> > As the oom path is not the critical path, adding some print info
> > should not add much overhead.
>
> Generating a lot of output for the oom reports has been a real problem
> in many deployments.

So why not only print non-zero counters?
If some counters are 0, we don't print them, which would shorten the OOM
reports. Entries such as
"isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0"
could all be dropped, with any omitted counter treated as zero by default.
I mean we can optimize the OOM reports to print only the useful
information, so that their size is no longer a problem in many deployments.

> [...]
> > > > I have said in the commit log, that we don't know why the file cache
> > > > can't be reclaimed (when evictable is 0 and dirty is 0 as well.)
> > >
> > > And the counter argument is that this will not help you there much in
> > > many large and much more common cases.
> > >
> > > I argue, and I might be wrong here so feel free to correct me, that the
> > > reclaim protection guarantee (min) is something to be under admins
> > > control. It shouldn't really happen willy-nilly because it has really
> > > large consequences, the OOM including. So if there is a suspicious
> > > amount of memory that could be reclaimed normally then the reclaim
> > > protection is really the first suspect to go after.
> > > --
> >
> > I don't know whether it happens willy-nilly or not.
>
> It is a reclaim protection guarantee (so essentially an mlock like
> thing) so it had better be properly considered when used.
>
> > But if we all know that it may cause OOMs and it doesn't take too much
> > effort to show it in the OOM messages,
>
> I do not think we are in agreement here. As mentioned above the oom
> report is quite heavy already. So it should be other way around. There
> should be a strong reason to add something more. A real use case where
> not having that information is making debugging ooms considerably
> harder.
>
> --
> Michal Hocko
> SUSE Labs
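
To make the non-zero-counter idea above concrete, here is a rough userspace
sketch. It is an illustration only, not the kernel's OOM reporting code and
not a proposed patch: struct oom_counter and format_nonzero_counters() are
hypothetical names invented for this example, and only the counter names are
taken from the report line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one named counter from an OOM-style report. */
struct oom_counter {
	const char *name;
	unsigned long value;
};

/*
 * Format only the non-zero counters into buf as "name:value" pairs
 * separated by single spaces. Counters that are zero are skipped, so a
 * reader treats any missing name as zero. Returns the number of
 * counters written; an entry that does not fit is dropped whole.
 */
static size_t format_nonzero_counters(const struct oom_counter *c, size_t n,
				      char *buf, size_t size)
{
	size_t used = 0, printed = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int len;

		if (c[i].value == 0)
			continue;	/* omitted means zero */

		len = snprintf(buf + used, size - used, "%s%s:%lu",
			       printed ? " " : "", c[i].name, c[i].value);
		if (len < 0 || (size_t)len >= size - used) {
			buf[used] = '\0';	/* discard the partial entry */
			break;
		}
		used += (size_t)len;
		printed++;
	}
	return printed;
}
```

With all five counters from the quoted line set to zero, this produces an
empty string, which is the saving being argued for. The trade-off is that
tools parsing the report can no longer rely on a fixed set of fields.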