On Thu, 23 Jun 2022 18:50:07 -0400 Xin Long wrote:
> From the perf data, we can see __sk_mem_reduce_allocated() is the one
> using CPU the most more than before, and mem_cgroup APIs are also
> called in this function. It means the mem cgroup must be enabled in
> the test env, which may explain why I couldn't reproduce it.
>
> The Commit 4890b686f4 ("net: keep sk->sk_forward_alloc as small as
> possible") uses sk_mem_reclaim(checking reclaimable >= PAGE_SIZE) to
> reclaim the memory, which is *more frequent* to call
> __sk_mem_reduce_allocated() than before (checking reclaimable >=
> SK_RECLAIM_THRESHOLD). It might be cheap when
> mem_cgroup_sockets_enabled is false, but I'm not sure if it's still
> cheap when mem_cgroup_sockets_enabled is true.
>
> I think SCTP netperf could trigger this, as the CPU is the bottleneck
> for SCTP netperf testing, which is more sensitive to the extra
> function calls than TCP.
>
> Can we re-run this testing without mem cgroup enabled?

FWIW I defer to Eric, thanks a lot for double checking the report and
digging in!
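
To make the "more frequent" point in the quoted text concrete, here is a
rough user-space toy model, not kernel code. It only counts how often a
slow-path reclaim would fire under a page-sized threshold versus a much
larger one. The TOY_* constants and both function names are hypothetical;
the 4 KiB page size and the 2 MiB / 1 MiB values standing in for the old
threshold and chunk are assumptions for illustration, and the model
ignores reserved memory and everything else the real functions do.

```c
#include <assert.h>

/* All values below are illustrative assumptions, not taken from the thread. */
#define TOY_PAGE_SIZE         4096L       /* assumed 4 KiB page */
#define TOY_RECLAIM_THRESHOLD (1L << 21)  /* assumed old threshold, 2 MiB */
#define TOY_RECLAIM_CHUNK     (1L << 20)  /* assumed old reclaim chunk, 1 MiB */

/*
 * Model of the older behaviour: forward-allocated bytes pile up and the
 * slow path runs only once they reach the large threshold, giving back
 * one chunk each time.
 */
static long toy_reclaims_large_threshold(long uncharges, long bytes_each)
{
	long fwd = 0, calls = 0;

	assert(uncharges >= 0 && bytes_each > 0);
	for (long i = 0; i < uncharges; i++) {
		fwd += bytes_each;
		if (fwd >= TOY_RECLAIM_THRESHOLD) {
			fwd -= TOY_RECLAIM_CHUNK;
			calls++;	/* slow path would run here */
		}
	}
	return calls;
}

/*
 * Model of the newer behaviour: the slow path runs as soon as at least
 * one full page is reclaimable, keeping only the sub-page remainder.
 */
static long toy_reclaims_page_threshold(long uncharges, long bytes_each)
{
	long fwd = 0, calls = 0;

	assert(uncharges >= 0 && bytes_each > 0);
	for (long i = 0; i < uncharges; i++) {
		fwd += bytes_each;
		if (fwd >= TOY_PAGE_SIZE) {
			fwd %= TOY_PAGE_SIZE;
			calls++;	/* slow path would run here */
		}
	}
	return calls;
}
```

Calling both with the same stream of uncharges (say 1024 uncharges of
4096 bytes each) shows how much more often the page-sized check reaches
the slow path in this model. Whether each of those calls is expensive is
exactly the mem cgroup question raised above, which the model does not
try to answer.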