On Tue, 2020-09-08 at 13:31 -0700, Yang Shi wrote:
> On Tue, Sep 8, 2020 at 1:14 PM Julius Hemanth Pitti <jpitti@xxxxxxxxx> wrote:
> >
> > For non root CG, in try_charge(), we keep trying
> > to charge until we succeed. On non-preemptive
> > kernel, when we are OOM, this results in holding
> > CPU forever.
> >
> > On SMP systems, this doesn't create a big problem
> > because oom_reaper get a change to kill victim
> > and make some free pages. However on a single-core
> > CPU (or cases where oom_reaper pinned to same CPU
> > where try_charge is executing), oom_reaper shall
> > never get scheduled and we stay in try_charge forever.
> >
> > Steps to repo this on non-smp:
> > 1. mount -t tmpfs none /sys/fs/cgroup
> > 2. mkdir /sys/fs/cgroup/memory
> > 3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
> > 4. mkdir /sys/fs/cgroup/memory/0
> > 5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
> > 6. echo $$ > /sys/fs/cgroup/memory/0/tasks
> > 7. stress -m 5 --vm-bytes 10M --vm-hang 0
>
> Isn't it the same problem solved by e3336cab2579 ("mm: memcg: fix
> memcg reclaim soft lockup")? It has been in Linus's tree.

Yes, indeed. I just tested with e3336cab2579, and it solved
this problem. Thanks for pointing it out.

>
> >
> > Signed-off-by: Julius Hemanth Pitti <jpitti@xxxxxxxxx>
> > Acked-by: Roman Gushchin <guro@xxxxxx>
> > ---
> >
> > Changes in v2:
> >  - Added comments.
> >  - Added "Acked-by: Roman Gushchin <guro@xxxxxx>".
> > ---
> >  mm/memcontrol.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index cfa6cbad21d5..4f293bf8c7ed 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2745,6 +2745,15 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  	if (fatal_signal_pending(current))
> >  		goto force;
> >
> > +	/*
> > +	 * We failed to charge even after retries, give oom_reaper or
> > +	 * other process a change to make some free pages.
> > +	 *
> > +	 * On non-preemptive, Non-SMP system, this is critical, else
> > +	 * we keep retrying with no success, forever.
> > +	 */
> > +	cond_resched();
> > +
> >  	/*
> >  	 * keep retrying as long as the memcg oom killer is able to make
> >  	 * a forward progress or bypass the charge if the oom killer
> > --
> > 2.17.1
> >
> >
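
For anyone reading the thread later, the failure mode described in the quoted changelog can be sketched in plain userspace C. This is only a toy model under stated assumptions, not kernel code: struct model, reaper_run(), model_cond_resched() and model_try_charge() are hypothetical names invented for illustration, the "CPU" is a single thread, and the retry loop is bounded by max_retries so the stuck case terminates instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: one CPU, no preemption. */
struct model {
	int free_pages;      /* pages currently available to charge */
	bool reaper_pending; /* reaper is runnable but has not had the CPU yet */
};

/* Stand-in for the OOM reaper: frees the victim's pages once it runs. */
static void reaper_run(struct model *m)
{
	if (m->reaper_pending) {
		m->free_pages += 10;
		m->reaper_pending = false;
	}
}

/*
 * Stand-in for cond_resched(): in this model the only other runnable
 * task is the reaper, so yielding the CPU simply lets it run.
 */
static void model_cond_resched(struct model *m)
{
	reaper_run(m);
}

/*
 * Stand-in for the retry loop in try_charge().
 * Returns the number of failed attempts before the charge succeeded,
 * or -1 if it never succeeded within max_retries attempts.
 */
static int model_try_charge(struct model *m, bool yield, int max_retries)
{
	assert(max_retries > 0);

	for (int retries = 0; retries < max_retries; retries++) {
		if (m->free_pages > 0) {
			m->free_pages--; /* charge one page */
			return retries;
		}
		if (yield)
			model_cond_resched(m); /* give the reaper a turn */
		/* without the yield, nothing else ever runs on this CPU */
	}
	return -1;
}
```

Calling model_try_charge() with yield set to false on a model that has no free pages never makes progress, because the reaper is never given the CPU. With yield set to true, the reaper runs after the first failed attempt and the next attempt succeeds.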