> -----Original Message-----
> From: Zhang, Cathy
> Sent: Wednesday, May 10, 2023 3:04 PM
> To: Shakeel Butt <shakeelb@xxxxxxxxxx>; Chen, Tim C <tim.c.chen@xxxxxxxxx>
> Cc: edumazet@xxxxxxxxxx; davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx;
> pabeni@xxxxxxxxxx; Brandeburg, Jesse <jesse.brandeburg@xxxxxxxxx>;
> Srinivas, Suresh <suresh.srinivas@xxxxxxxxx>; You, Lizhen
> <Lizhen.You@xxxxxxxxx>; eric.dumazet@xxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> linux-mm@xxxxxxxxx; cgroups@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a
> proper size
>
> > -----Original Message-----
> > From: Shakeel Butt <shakeelb@xxxxxxxxxx>
> > Sent: Wednesday, May 10, 2023 2:18 AM
> > To: Chen, Tim C <tim.c.chen@xxxxxxxxx>
> > Cc: Zhang, Cathy <cathy.zhang@xxxxxxxxx>; edumazet@xxxxxxxxxx;
> > davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx;
> > Brandeburg, Jesse <jesse.brandeburg@xxxxxxxxx>; Srinivas, Suresh
> > <suresh.srinivas@xxxxxxxxx>; You, Lizhen <lizhen.you@xxxxxxxxx>;
> > eric.dumazet@xxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > cgroups@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a
> > proper size
> >
> > On Tue, May 9, 2023 at 11:04 AM Chen, Tim C <tim.c.chen@xxxxxxxxx> wrote:
> > >
> > > >> Run memcached with memtier_benchmark to verify the optimization
> > > >> fix. 8 server-client pairs are created with a bridge network on
> > > >> localhost; the server and client of each pair share 28 logical CPUs.
> > > >>
> > > >> Results (average of 5 runs)
> > > >> RPS (with/without patch) +2.07x
> > > >
> > > > Do you have regression data from any production workload? Please
> > > > keep in mind that we (the MM subsystem) often accept regressions in
> > > > microbenchmarks rather than take on complicated optimizations. So,
> > > > if there is a real production regression, please be very explicit
> > > > about it.
> > >
> > > memcached is actually used by people in production, though, so this
> > > isn't an unrealistic scenario.
> > >
> >
> > Yes, memcached is used in production, but I am not sure anyone runs 8
> > pairs of server and client on the same machine for a production
> > workload. Anyway, we can discuss the practicality of the benchmark, if
> > needed, once we have some impactful memcg optimizations.
>
> The test is run on a platform with 224 CPUs (HT enabled). Running 8
> pairs is not required: the memcg charge hot paths can be observed with
> a single pair as long as more CPUs are involved; the point is to use
> all CPU resources for TCP connections and stress the contention. If we
> run fewer server-client pairs (<= 3), each pair sharing 28 CPUs, at
> most 84 CPUs are actually busy and no obvious memcg charge overhead is
> observed. But once more than 112 CPUs (>= 4 pairs) stress the system
> with TCP memory allocation, memcg charging becomes the bottleneck.
>
> >
> > > Tim
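To make the batching effect under discussion concrete, here is a minimal
userspace sketch of the idea behind sk->sk_forward_alloc: each socket keeps a
private byte reserve so that most allocations never touch the shared (memcg)
counter, and only whole-page refills do. This is an illustration only, not the
kernel code; the names (sock_sketch, mem_charge, mem_schedule) are made up for
the example.

/* gcc -O2 sketch.c && ./a.out */
#include <stdatomic.h>
#include <stdio.h>

#define PAGE_SIZE 4096L

/* Stand-in for the shared memcg page counter: every update here is an
 * atomic RMW on a cache line contended by all CPUs. */
static atomic_long shared_memcg_pages;

/* Stand-in for struct sock: a private, uncontended reserve in bytes. */
struct sock_sketch {
	long forward_alloc;	/* bytes charged but not yet consumed */
	long charge_calls;	/* how often we hit the shared counter */
};

/* Refill the per-socket reserve in whole pages; this is the only place
 * that touches the shared counter. */
static void mem_schedule(struct sock_sketch *sk, long bytes)
{
	long pages = (bytes + PAGE_SIZE - 1) / PAGE_SIZE;

	atomic_fetch_add(&shared_memcg_pages, pages);
	sk->forward_alloc += pages * PAGE_SIZE;
	sk->charge_calls++;
}

/* Per-packet charge: cheap whenever the reserve covers the request. */
static void mem_charge(struct sock_sketch *sk, long bytes)
{
	if (sk->forward_alloc < bytes)
		mem_schedule(sk, bytes - sk->forward_alloc);
	sk->forward_alloc -= bytes;
}

int main(void)
{
	struct sock_sketch sk = { 0 };

	/* 10000 small TCP allocations of 256 bytes each. */
	for (int i = 0; i < 10000; i++)
		mem_charge(&sk, 256);

	/* With page-sized refills, only 1 in 16 allocations reaches the
	 * shared counter (prints 625); if the reserve were drained back to
	 * zero after every packet, all 10000 would. */
	printf("shared-counter updates: %ld of 10000 allocations\n",
	       sk.charge_calls);
	return 0;
}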
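And a minimal sketch of why the overhead only appears past a certain CPU
count, as described above: N threads doing atomic increments on one shared
counter scale poorly, because every update bounces the counter's cache line
between cores. This models only the shared-counter contention, not the actual
memcg charge path.

/* gcc -O2 -pthread contend.c && time ./a.out <nthreads> */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define ITERS 10000000L

static atomic_long shared_counter;

static void *worker(void *arg)
{
	(void)arg;
	for (long i = 0; i < ITERS; i++)
		atomic_fetch_add(&shared_counter, 1);
	return NULL;
}

int main(int argc, char **argv)
{
	int nthreads = argc > 1 ? atoi(argv[1]) : 4;
	pthread_t *tids = malloc(sizeof(*tids) * nthreads);

	for (int i = 0; i < nthreads; i++)
		pthread_create(&tids[i], NULL, worker, NULL);
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	/* Wall-clock time per increment grows with nthreads even though
	 * each thread does a fixed amount of work: that is the cache-line
	 * contention a batched per-socket reserve avoids. */
	printf("%d threads, counter=%ld\n", nthreads,
	       atomic_load(&shared_counter));
	free(tids);
	return 0;
}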