On Thu, Feb 24, 2022 at 9:34 AM Daniel Dao <dqminh@xxxxxxxxxxxxxx> wrote:
>
> [...]
> > Thanks for testing. At the moment I am suspecting the async worker is
> > not getting the CPU. Can you share your CONFIG_HZ setting? Also can you
> > try the following patch and see if that helps otherwise keep halving the
> > delay (i.e. 2HZ -> HZ -> HZ/2 -> ...) and find at what value the issue
> > you are seeing get resolved?
>
> We have CONFIG_HZ=1000. We can try to increase the frequency of async flush, but
> that seems like a not great bandaid. Is it possible to remove
> mem_cgroup_flush_stats() from workingset_refault, or at least scope it down to
> some targeted cgroup so we don't need to flush from root with potentially large
> sets of cgroups to walk ?

I actually wanted to know what would be the good frequency of rstat
flushing for your workload. I am not planning to propose to change the
default frequency.

Anyways I am thinking of introducing mem_cgroup_flush_stats_asyn() which
will schedule flush_memcg_stats_dwork() without delay. Let me prepare
the patch based on 5.15-stable for you to test.
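
For reference, the halving sequence from the quoted text (2HZ -> HZ -> HZ/2 -> ...) can be translated into wall-clock delays for a given CONFIG_HZ. What follows is only a userspace C sketch of that arithmetic. It is not kernel code and it is not the patch mentioned above. The helper names delay_jiffies_to_ms(), halving_delays() and print_halving_delays() are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel API): convert a delay expressed in
 * jiffies into milliseconds for a given CONFIG_HZ value.
 */
unsigned long delay_jiffies_to_ms(unsigned long delay_jiffies, unsigned long hz)
{
    assert(hz > 0);
    return delay_jiffies * 1000UL / hz;
}

/*
 * Hypothetical helper: fill out[] with the halving sequence
 * 2*HZ, HZ, HZ/2, ... (integer division) until the delay reaches
 * 0 jiffies or max entries have been written. Returns the entry count.
 */
size_t halving_delays(unsigned long hz, unsigned long *out, size_t max)
{
    size_t n = 0;
    unsigned long d = 2UL * hz;

    while (d > 0 && n < max) {
        out[n++] = d;
        d /= 2;
    }
    return n;
}

/* Print each candidate delay in jiffies and in milliseconds. */
void print_halving_delays(unsigned long hz)
{
    unsigned long delays[32];
    size_t n = halving_delays(hz, delays, 32);

    for (size_t i = 0; i < n; i++)
        printf("%lu jiffies = %lu ms\n",
               delays[i], delay_jiffies_to_ms(delays[i], hz));
}
```

Calling print_halving_delays(1000) lists the candidate delays for the CONFIG_HZ=1000 case reported above, starting at 2000 jiffies (2000 ms) and halving from there.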