On Fri, Sep 04, 2015 at 05:00:11PM -0400, Tejun Heo wrote:
> Currently, try_charge() tries to reclaim memory synchronously when the
> high limit is breached; however, if the allocation doesn't have
> __GFP_WAIT, synchronous reclaim is skipped.  If a process performs
> only speculative allocations, it can blow way past the high limit.
> This is actually easily reproducible by simply doing "find /".
> slab/slub allocator tries speculative allocations first, so as long as
> there's memory which can be consumed without blocking, it can keep
> allocating memory regardless of the high limit.
>
> This patch makes try_charge() always punt the over-high reclaim to the
> return-to-userland path.  If try_charge() detects that high limit is
> breached, it adds the overage to current->memcg_nr_pages_over_high and
> schedules execution of mem_cgroup_handle_over_high() which performs
> synchronous reclaim from the return-to-userland path.
>
> As long as kernel doesn't have a run-away allocation spree, this
> should provide enough protection while making kmemcg behave more
> consistently.

Another good thing about such an approach is that it copes with prio
inversion. Currently, a task with small memory.high might issue
memory.high reclaim on kmem charge with a bunch of various locks held.
If a task with a big value of memory.high needs any of these locks,
it'll have to wait until the low prio task finishes reclaim and releases
the locks. By handing over reclaim to task_work whenever possible we
might avoid this issue and improve overall performance.

>
> v2: - Switched to reclaiming only the overage caused by current rather
>       than the difference between usage and high as suggested by
>       Michal.
>     - Don't record the memcg which went over high limit.  This makes
>       exit path handling unnecessary.  Dropped.
>     - Drop mentions of avoiding high stack usage from description as
>       suggested by Vladimir.  max limit still triggers direct reclaim.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxxxx>
> Cc: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>

Reviewed-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
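
For anyone following the thread, the deferral described in the quoted
changelog can be modeled in plain userspace C. This is only a rough
sketch, not the patch itself: struct memcg_model, struct task_model,
model_try_charge() and model_handle_over_high() are hypothetical names,
reclaim is modeled as simply uncharging pages, and the whole charge is
counted as overage once usage exceeds high (an assumption about what
"overage" means here).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memory cgroup: usage and high limit only. */
struct memcg_model {
	unsigned long usage;	/* pages currently charged */
	unsigned long high;	/* models memory.high, in pages */
};

/* Hypothetical stand-in for the per-task state the changelog mentions. */
struct task_model {
	struct memcg_model *memcg;
	unsigned long nr_pages_over_high; /* models current->memcg_nr_pages_over_high */
	bool resume_work_pending;	  /* models the scheduled return-to-userland work */
};

/*
 * Charge path: never reclaims for the high limit.  It only records the
 * overage caused by this task and marks the deferred work as pending,
 * so it stays cheap even if the caller holds locks.
 */
static void model_try_charge(struct task_model *t, unsigned long nr_pages)
{
	assert(nr_pages > 0);

	t->memcg->usage += nr_pages;
	if (t->memcg->usage > t->memcg->high) {
		/* Assumption: the full charge counts as overage. */
		t->nr_pages_over_high += nr_pages;
		t->resume_work_pending = true;
	}
}

/*
 * Return-to-userland path: reclaims only what this task recorded, then
 * clears the per-task state.  Returns the number of pages "reclaimed".
 */
static unsigned long model_handle_over_high(struct task_model *t)
{
	unsigned long nr = t->nr_pages_over_high;

	if (!t->resume_work_pending)
		return 0;

	t->resume_work_pending = false;
	t->nr_pages_over_high = 0;

	/* Reclaim is modeled as uncharging; never go below zero. */
	if (nr > t->memcg->usage)
		nr = t->memcg->usage;
	t->memcg->usage -= nr;
	return nr;
}
```

With high set to 100 pages, charging 90, then 32, then 32 again leaves
usage at 154 with 64 pages recorded against the task; running the
return path then reclaims those 64 pages and brings usage back to 90.
The charge path itself never blocks, which is the property the prio
inversion argument above relies on.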