On 2020/8/26 4:11 PM, Michal Hocko wrote:
> On Wed 26-08-20 15:27:02, Xunlei Pang wrote:
>> We've met softlockup with "CONFIG_PREEMPT_NONE=y", when
>> the target memcg doesn't have any reclaimable memory.
>
> Do you have any scenario when this happens or is this some sort of a
> test case?

It can happen in tiny-guest scenarios.

>
>> It can be easily reproduced as below:
>>  watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204]
>>  CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12
>>  Call Trace:
>>   shrink_lruvec+0x49f/0x640
>>   shrink_node+0x2a6/0x6f0
>>   do_try_to_free_pages+0xe9/0x3e0
>>   try_to_free_mem_cgroup_pages+0xef/0x1f0
>>   try_charge+0x2c1/0x750
>>   mem_cgroup_charge+0xd7/0x240
>>   __add_to_page_cache_locked+0x2fd/0x370
>>   add_to_page_cache_lru+0x4a/0xc0
>>   pagecache_get_page+0x10b/0x2f0
>>   filemap_fault+0x661/0xad0
>>   ext4_filemap_fault+0x2c/0x40
>>   __do_fault+0x4d/0xf9
>>   handle_mm_fault+0x1080/0x1790
>>
>> It only happens on our 1-vcpu instances, because there's no chance
>> for oom reaper to run to reclaim the to-be-killed process.
>>
>> Add cond_resched() in such cases at the beginning of shrink_lruvec()
>> to give up the cpu to others.
>
> I do agree that we need a cond_resched but I cannot say I would like
> this patch. The primary reason is that it doesn't catch all cases when
> the memcg is not reclaimable. For example it wouldn't reschedule if the
> memcg is protected by low/min. What do you think about this instead?
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 99e1796eb833..bbdc38b58cc5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2617,6 +2617,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>
>  		mem_cgroup_calculate_protection(target_memcg, memcg);
>
> +		cond_resched();
> +
>  		if (mem_cgroup_below_min(memcg)) {
>  			/*
>  			 * Hard protection.
>
> This should catch both cases. I even have a vague recollection that
> somebody has proposed something in that direction but I cannot remember
> what has happened with that patch.
>

It's the endless "retry" in try_charge() that caused the softlockup.
I think mem_cgroup_protected() will eventually return MEMCG_PROT_NONE,
and shrink_node_memcgs() will then call shrink_lruvec() in the memcg
self-reclaim case, so the low/min protection is not a problem here.

But adding cond_resched() higher up, in shrink_node_memcgs(), may
eliminate other potential issues of this kind, so I have no objection
to this approach. I tested it and it works well; I will send v2 later.
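
To make the failure mode concrete, here is a minimal userspace sketch in C.
It is a toy model, not kernel code: every name in it (toy_system,
toy_try_charge(), toy_cond_resched() and so on) is hypothetical, and the
assumption that a single yield is enough for the reaper to run and free
memory is a simplification of mine. It only shows why a retry loop that
never yields cannot make progress on one non-preemptible CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for a one-CPU, non-preemptible system. All names are hypothetical. */
struct toy_system {
	long reclaimable_pages;	/* pages the memcg can free by itself */
	long victim_pages;	/* pages still held by the already-killed task */
	long free_pages;	/* pages currently available for charging */
	bool reaper_pending;	/* the reaper is runnable but has not run yet */
};

/* Stand-in for the other runnable task: it frees the victim's memory. */
void toy_run_reaper(struct toy_system *s)
{
	if (!s->reaper_pending)
		return;
	s->free_pages += s->victim_pages;
	s->victim_pages = 0;
	s->reaper_pending = false;
}

/* Stand-in for cond_resched(): with one CPU, yielding is the only way
 * the reaper ever gets to run. */
void toy_cond_resched(struct toy_system *s)
{
	toy_run_reaper(s);
}

/* Stand-in for one reclaim pass over the memcg. */
void toy_reclaim(struct toy_system *s, bool yield_in_reclaim)
{
	if (yield_in_reclaim)
		toy_cond_resched(s);
	s->free_pages += s->reclaimable_pages;
	s->reclaimable_pages = 0;
}

/*
 * Stand-in for the retry loop in the charge path. Returns the iteration
 * on which the charge succeeded, or -1 if the loop used up its whole
 * budget, which is this toy's version of a soft lockup.
 */
long toy_try_charge(struct toy_system *s, long nr_pages,
		    bool yield_in_reclaim, long watchdog_limit)
{
	assert(nr_pages > 0 && watchdog_limit > 0);

	for (long i = 1; i <= watchdog_limit; i++) {
		if (s->free_pages >= nr_pages) {
			s->free_pages -= nr_pages;
			return i;
		}
		toy_reclaim(s, yield_in_reclaim);
	}
	return -1;
}
```

Starting from a system with no reclaimable pages and a pending reaper,
calling toy_try_charge() with yield_in_reclaim set to false spins until the
budget runs out, while the same call with yield_in_reclaim set to true lets
the reaper run on the first pass and succeeds on the next one.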