On Fri, Feb 25, 2022 at 9:09 AM Ivan Babrou <ivan@xxxxxxxxxxxxxx> wrote:
>
> On Fri, Feb 25, 2022 at 2:23 AM Daniel Dao <dqminh@xxxxxxxxxxxxxx> wrote:
> > I think this looks good so far.
>
> I compared a flamegraph before to a flamegraph after (10s @ 99Hz on
> 96-core CPU evenly loaded to ~75% in both cases).
>
> Before: 1.4% spent in workingset_refault.
> After: 0.5% spent in flush_memcg_stats_dwork.
>
> The latter is all in kworkers (as expected), while the former is
> spread across IO active tasks.
>
> This seems like a great first step that should be merged on its own.
> It would be good to also do something to improve the CPU time spent in
> delayed work, if possible, as 0.5% of on-CPU time is not a negligible
> amount.

I will send the official async patch soon. I agree that we need to
improve the rstat flush infrastructure as well, but that would need
more thought and time.
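
For a rough sense of scale, the quoted percentages can be converted into sample counts and core-equivalents. This is only a back-of-the-envelope sketch: it assumes the percentages are shares of on-CPU samples and that all 96 cores stay uniformly ~75% busy for the whole 10 s capture. The function names are made up for illustration and are not part of any profiling tool.

```python
# Rough scale of the quoted flamegraph percentages.
# Assumptions (not stated in the thread): percentages are shares of on-CPU
# samples, and the 96 cores are uniformly ~75% busy for the whole capture.

CORES = 96          # from the quoted message
LOAD = 0.75         # "evenly loaded to ~75%"
DURATION_S = 10     # "10s"
SAMPLE_HZ = 99      # "@ 99Hz"


def busy_cores(cores: int = CORES, load: float = LOAD) -> float:
    """Average number of cores that are on-CPU at any moment."""
    return cores * load


def on_cpu_samples() -> float:
    """Expected number of on-CPU samples in the capture."""
    return busy_cores() * DURATION_S * SAMPLE_HZ


def core_equivalents(share: float) -> float:
    """Convert a share of on-CPU time into an equivalent number of cores."""
    return busy_cores() * share


if __name__ == "__main__":
    for name, share in [("workingset_refault (before)", 0.014),
                        ("flush_memcg_stats_dwork (after)", 0.005)]:
        print(f"{name}: ~{on_cpu_samples() * share:.0f} samples, "
              f"~{core_equivalents(share):.2f} cores")
```

Under these assumptions, 1.4% works out to about one full core and 0.5% to roughly a third of a core, which is consistent with the remark that the remaining cost is not negligible.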