> -----Original Message-----
> From: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Sent: Wednesday, May 10, 2023 1:58 AM
> To: Zhang, Cathy <cathy.zhang@xxxxxxxxx>; Linux MM <linux-mm@xxxxxxxxx>; Cgroups <cgroups@xxxxxxxxxxxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>; Paolo Abeni <pabeni@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; kuba@xxxxxxxxxx; Brandeburg, Jesse <jesse.brandeburg@xxxxxxxxx>; Srinivas, Suresh <suresh.srinivas@xxxxxxxxx>; Chen, Tim C <tim.c.chen@xxxxxxxxx>; You, Lizhen <lizhen.you@xxxxxxxxx>; eric.dumazet@xxxxxxxxx; netdev@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH net-next 1/2] net: Keep sk->sk_forward_alloc as a proper size
>
> On Tue, May 9, 2023 at 8:07 AM Zhang, Cathy <cathy.zhang@xxxxxxxxx> wrote:
> >
> [...]
> > >
> > > Something must be wrong in your setup, because the only small issue
> > > that was noticed was the memcg one that Shakeel solved last year.
> >
> > As mentioned in the commit log, the test creates 8 memcached-memtier
> > pairs on the same host. When the server and client of a pair connect
> > to the same CPU socket and share the same CPU set (28 CPUs), the memcg
> > overhead is obviously high, as shown in the commit log. If they are
> > given separate CPU sets from different CPU sockets, the overhead is
> > not as high, but still observed. Here are the server/client commands
> > in our test:
> >
> > server:
> > memcached -p ${port_i} -t ${threads_i} -c 10240
> >
> > client:
> > memtier_benchmark --server=${memcached_id} --port=${port_i} \
> >   --protocol=memcache_text --test-time=20 --threads=${threads_i} \
> >   -c 1 --pipeline=16 --ratio=1:100 --run-count=5
> >
> > So, is there anything wrong you see?
> >
>
> What is the memcg hierarchy of this workload? Is each server and client
> process running in its own memcg? How many levels of memcgs? Are you
> setting memory.max and memory.high to some value? Also, how are you
> limiting the processes to CPUs? cpusets?
Here is the full command to start a memcached instance:

docker run -d --name ${memcached_name} --privileged --memory 1G --network bridge \
  -p ${port_i}:${port_i} ${cpu_pinning_s[set]} memcached memcached -p ${port_i} \
  -t ${threads_i} -c 10240

We have a script that picks a CPU set from the same NUMA node; both the CPU
count and the thread count for each instance are
Num(system online CPUs) / Num(memcached instances). That is, if we run 8
memcached instances, 224 / 8 = 28, so each instance is assigned 28 CPUs and
28 threads.

Here is the full command to start a memtier instance:

docker run --rm --network bridge ${cpu_pinning_s[set]} --memory 1G \
  redislabs/memtier_benchmark memtier_benchmark --server=${memcached_id} --port=${port_i} \
  --protocol=memcache_text --test-time=20 --threads=${threads_i} -c 1 --pipeline=16 --ratio=1:100 \
  --run-count=5 --hide-histogram

Each memtier instance is pinned to the same CPU set as the server it connects
to, and uses the same thread count. That is all for the server and client
settings.
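For clarity, the CPU-set assignment described above can be sketched roughly as
follows. The 224-CPU total and the 8-instance split are from this thread; the
contiguous-range indexing and the variable names are assumptions, since the
actual pinning script was not posted:

```shell
#!/bin/sh
# Rough sketch (assumption, not the actual script): split the online CPUs
# evenly across the memcached instances, one contiguous range per instance.
total_cpus=224   # Num(system online CPUs) on this host
instances=8      # number of memcached-memtier pairs
cpus_per_instance=$(( total_cpus / instances ))   # 224 / 8 = 28

i=2   # example 0-based instance index
start=$(( i * cpus_per_instance ))
end=$(( start + cpus_per_instance - 1 ))
cpuset="${start}-${end}"

echo "$cpuset"
# This range would then be passed to docker, e.g.:
#   docker run ... --cpuset-cpus="$cpuset" ...
# with the same value reused for the matching memtier client, and
# ${threads_i} set to $cpus_per_instance.
```

For instance 2 in this scheme the range comes out as 56-83, i.e. 28 CPUs,
matching the 28 threads per instance described above.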