On Mon, May 4, 2020 at 8:46 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 04-05-20 20:34:01, Yafang Shao wrote:
> > On Mon, May 4, 2020 at 4:18 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > [It would be really great if a newer version was posted only after there
> > > was a wider consensus on the approach.]
> > >
> > > On Mon 04-05-20 00:26:21, Yafang Shao wrote:
> > > > Recently Shakeel reported an issue which also confused me several months
> > > > earlier. Below is his report -
> > > > Lowering memory.max can trigger an oom-kill if the reclaim does not
> > > > succeed. However if oom-killer does not find a process for killing, it
> > > > dumps a lot of warnings.
> > > > Deleting a memcg does not reclaim memory from it and the memory can
> > > > linger until there is memory pressure. One normal way to proactively
> > > > reclaim such memory is to set memory.max to 0 just before deleting the
> > > > memcg. However if some of the memcg's memory is pinned by others, this
> > > > operation can trigger an oom-kill without any process and thus can log a
> > > > lot of un-needed warnings. So, ignore all such warnings from memory.max.
> > > >
> > > > A better way to avoid this issue is to avoid trying to kill a process if
> > > > memcg is not populated.
> > > > Note that OOM is different from OOM kill. OOM is a status that the
> > > > system or memcg is out of memory, while OOM kill is a result that a
> > > > process inside this memcg is killed when this memcg is in OOM status.
> > >
> > > Agreed.
> > >
> > > > That is the same reason why there are both a MEMCG_OOM event and a
> > > > MEMCG_OOM_KILL event. If we already know that there is nothing to
> > > > kill, i.e. the memcg is not populated, then we don't need to try.
> > >
> > > OK, but you are not explaining why a silent failure is really better
> > > than no oom report under oom situation. With your patch, there is
> > > no failure reported to the user and there is also no sign that there
> > > might be a problem that memcg leaves memory behind that is not bound to
> > > any (killable) process. This could be important information.
> > >
> >
> > That is not a silent failure. An oom event will be reported.
> > The user can get this event from memory.events or memory.events.local if
> > he really cares about it.
>
> You are right. The oom situation will be reported (somehow) but there
> might be several reasons why no task has been killed and there is no way
> to report no eligible tasks.
>
> > Especially when the admin sets memory.max to 0 to drop all the caches,
> > many oom logs are just noise; besides that, there are some side effects,
> > for example too many oom logs printed to a slow console may cause some
> > latency spikes.
>
> But the oom situation and the oom report is simply something an admin
> has to expect especially when the hard limit is set to 0. With kmem
> accounting there is no guarantee that the target will be met.

I have always wondered: why not move the kmem from this memcg to
the root_mem_cgroup in this situation?
Then this memcg could be easily reclaimed.

> > >
> > > Besides that I really do not see any actual problem that this would be
> > > fixing.
> >
> > Avoid printing too many oom logs.
>
> There is only a single oom report printed so I disagree this is really a
> proper justification.
>
> Unless you can come up with a better justification I am against this
> patch. It unnecessarily reduces debugging tools while it doesn't really
> provide any huge advantage. Changing the hard limit to an impossible target
> is known to trigger the oom killer and the oom report is a part of that.
> If the oom report is too noisy then we can discuss how to make it
> more compact but making ad-hoc exceptions like this one is not a good
> solution.
> --

No better justification yet. But I think more memcg users will
complain about it.

--
Thanks
Yafang
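
For reference, the difference between the oom event and the oom_kill event discussed above can be observed from userspace through the cgroup v2 memory.events file, which holds flat "key value" counters. The following is only a minimal sketch under stated assumptions: the helper names, the sample contents, and the example cgroup path are hypothetical and not part of the patch or of any kernel interface. Note that a nonzero oom count with a zero oom_kill count only says that no task was killed; it does not say why, which is the limitation raised in the thread.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the flat 'key value' lines of a cgroup v2 memory.events file."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <non-negative integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def oom_without_kill(counters):
    """True if the memcg reported OOM at least once but no task was killed."""
    return counters.get("oom", 0) > 0 and counters.get("oom_kill", 0) == 0


def read_memory_events(cgroup_dir):
    # cgroup_dir is a hypothetical path, e.g. /sys/fs/cgroup/mygroup
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


# Sample contents made up for illustration, not taken from a real system.
sample = "low 0\nhigh 0\nmax 12\noom 3\noom_kill 0\n"
counters = parse_memory_events(sample)
print(counters["oom"], counters["oom_kill"], oom_without_kill(counters))
```

On a real system one would call read_memory_events() on the cgroup directory (or parse memory.events.local the same way) before and after lowering memory.max and compare the counters.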