Re: [PATCH v2] mm: don't raise MEMCG_OOM event due to failed high-order allocation

On Thu, Oct 04, 2018 at 09:41:09PM +0000, Roman Gushchin wrote:
> It was reported to me that on some of our machines containers were
> restarted with OOM symptoms without an obvious reason. Although there
> was almost no memory pressure and plenty of page cache, the MEMCG_OOM
> event was raised occasionally, causing the container management
> software to think that an OOM had happened. However, no tasks had been
> killed.
> 
> Subsequent investigation showed that the problem is caused by
> a failing attempt to charge a high-order page. In such a case, the
> OOM killer is never invoked. As shown below, it can happen under
> conditions that are very far from a real OOM: e.g. there is plenty
> of clean page cache and no memory pressure.
> 
> It makes no sense to raise an OOM event in this case, as it might
> confuse a user and lead to wrong and excessive actions (e.g. restarting
> the workload, as in my case).
> 
> Let's look at the charging path in try_charge(). If the memory usage
> is close to memory.max, which is perfectly natural for most memory
> cgroups, we try to reclaim some pages. Even if we were able to reclaim
> enough memory for the allocation, the following check can fail due to
> a race with another concurrent allocation:
> 
>     if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
>         goto retry;
> 
> For regular pages the following condition will save us from triggering
> the OOM:
> 
>    if (nr_reclaimed && nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER))
>        goto retry;
> 
> But for a high-order allocation this condition will intentionally fail.
> The reason is that we'll likely fall back to regular pages anyway,
> so it's ok and even preferred to return ENOMEM.
> 
> In this case the idea of raising MEMCG_OOM looks dubious.
> 
> Fix this by moving the raising of MEMCG_OOM into mem_cgroup_oom(), after
> the allocation order check, so that the event won't be raised for
> high-order allocations. This change doesn't affect the allocation and
> charging of regular pages.
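
The decision logic described above can be modelled in user space as a
rough sketch. This is not the kernel code: charge_decision() and
enum charge_result are hypothetical names invented for illustration,
retry counters and the actual reclaim are left out, and the margin and
reclaim counts are passed in as plain numbers. Only the two quoted
conditions and the order check are taken from the description.

```c
#include <stdbool.h>

/* Same value as the mainline kernel definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical outcomes of one pass through the modelled charge path. */
enum charge_result {
	CHARGE_RETRY,         /* go back and try the charge again */
	CHARGE_OOM_EVENT,     /* MEMCG_OOM is raised, OOM handling proceeds */
	CHARGE_ENOMEM_SILENT  /* fail with ENOMEM, no MEMCG_OOM event */
};

/*
 * Hypothetical model of the post-reclaim decision with the patch applied.
 * margin:       pages still chargeable below the limit (assumed input)
 * nr_reclaimed: pages freed by the preceding reclaim attempt (assumed input)
 * order:        allocation order, so nr_pages = 1 << order
 */
static enum charge_result charge_decision(unsigned long margin,
					   unsigned long nr_reclaimed,
					   unsigned int order)
{
	unsigned long nr_pages = 1UL << order;

	/* Enough room appeared (unless a concurrent charge raced us). */
	if (margin >= nr_pages)
		return CHARGE_RETRY;

	/* Regular (non-costly) sizes retry as long as reclaim made progress. */
	if (nr_reclaimed && nr_pages <= (1UL << PAGE_ALLOC_COSTLY_ORDER))
		return CHARGE_RETRY;

	/*
	 * High-order allocations bail out before the event is raised;
	 * the caller is expected to fall back to regular pages.
	 */
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return CHARGE_ENOMEM_SILENT;

	return CHARGE_OOM_EVENT;
}
```

In this model an order-9 charge that loses the margin race ends in
CHARGE_ENOMEM_SILENT even when reclaim made progress, whereas an
order-0 charge with no margin and no reclaim progress still reaches
CHARGE_OOM_EVENT.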
> 
> Signed-off-by: Roman Gushchin <guro@xxxxxx>
> Acked-by: David Rientjes <rientjes@xxxxxxxxxx>
> Acked-by: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>

Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>



