On Thu, Oct 04, 2018 at 09:41:09PM +0000, Roman Gushchin wrote: > I was reported that on some of our machines containers were restarted > with OOM symptoms without an obvious reason. Despite there were almost > no memory pressure and plenty of page cache, MEMCG_OOM event was > raised occasionally, causing the container management software to > think, that OOM has happened. However, no tasks have been killed. > > The following investigation showed that the problem is caused by > a failing attempt to charge a high-order page. In such case, the > OOM killer is never invoked. As shown below, it can happen under > conditions, which are very far from a real OOM: e.g. there is plenty > of clean page cache and no memory pressure. > > There is no sense in raising an OOM event in this case, as it might > confuse a user and lead to wrong and excessive actions (e.g. restart > the workload, as in my case). > > Let's look at the charging path in try_charge(). If the memory usage > is about memory.max, which is absolutely natural for most memory cgroups, > we try to reclaim some pages. Even if we were able to reclaim > enough memory for the allocation, the following check can fail due to > a race with another concurrent allocation: > > if (mem_cgroup_margin(mem_over_limit) >= nr_pages) > goto retry; > > For regular pages the following condition will save us from triggering > the OOM: > > if (nr_reclaimed && nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER)) > goto retry; > > But for high-order allocation this condition will intentionally fail. > The reason behind is that we'll likely fall to regular pages anyway, > so it's ok and even preferred to return ENOMEM. > > In this case the idea of raising MEMCG_OOM looks dubious. > > Fix this by moving MEMCG_OOM raising to mem_cgroup_oom() after > allocation order check, so that the event won't be raised for high > order allocations. This change doesn't affect regular pages allocation > and charging. > > Signed-off-by: Roman Gushchin <guro@xxxxxx> > Acked-by: David Rientjes <rientjes@xxxxxxxxxx> > Acked-by: Michal Hocko <mhocko@xxxxxxxxxx> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx> > Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>