On Mon, Nov 22, 2021 at 04:08:37PM +0100, Karsten Graul wrote: > On 22/11/2021 14:42, Tony Lu wrote: > > SMC uses smc->sk.sk_{rcv|snd}buf to create buffer for send buffer or > > RMB. And the values of buffer size inherits from clcsock. The clcsock is > > a TCP sock which is initiated during SMC connection startup. > > > > The inherited buffer size doesn't fit SMC well. TCP provides two sysctl > > knobs to tune r/w buffers, net.ipv4.tcp_{r|w}mem, and SMC use the default > > value from TCP. The buffer size is tuned for TCP, but not fit SMC well > > in some scenarios. For example, we need larger buffer of SMC for high > > throughput applications, and smaller buffer of SMC for saving contiguous > > memory. We need to adjust the buffer size apart from TCP and not to > > disturb TCP. > > > > This unbinds buffer size which inherits from clcsock, and provides > > sysctl knobs to adjust buffer size independently. These knobs can be > > tuned with different values for different net namespaces for performance > > and flexibility. > > > > Signed-off-by: Tony Lu <tonylu@xxxxxxxxxxxxxxxxx> > > Reviewed-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx> > > --- > > To activate SMC for existing programs usually the smc_run command or the > preload library (both from the smc-tools package) are used. > This commit introduced support to set the send and recv window sizes > using command line parameters or environment variables: > > https://github.com/ibm-s390-linux/smc-tools/commit/59bfb99c588746f7dca1b3c97fd88f3f7cbc975f Hi Graul, Thanks for your advice. We are using smc-tools, it is a very useful tool, and we also use smc_run or LD_PRELOAD to help our applications to replace with SMC from TCP. There are some differences to use SMC in our environment. The followings are our scenarios to use SMC: 1. Transparent acceleration This approach is widely used in our environment. The main idea of transparent acceleration is not to touch the applications. The applications are usually pre-compiled and pre-packaged containers or ECS, which means the binary and the binary needed environments, like glibc and other libraries are bundled as bootstrap containers. So it is hard to inject the smc_run or LD_PRELOAD into application's containers runtime. To solve this issue, we developed a set of patches to replace the AF_INET / SOCK_STREAM with AF_SMC / SMCPROTO_SMC{6} by configuration. So that we can control acceleration in kernel without any other changes in user-space, and won't break our application containers and publish workflow. These patches are still improving for upstream. 2. Use SMC explicitly This approach is very straightforward. Applications just create sockets using AF_SMC and SMCPROTO_SMC{6}, and SMC works fine. However, most of applications don't want to bind tightly to single SMC protocol. Most of them take into account compatibility, stability and flexibility. > Why another way to manipulate these sizes? > Your solution would stop applications to set these values. I didn't understand it clearly about stopping applications to set there values. IMHO, this RFC introduces two knobs for snd/rcvbuf. During the following stages, applications can set these values as expected. 1. SMC module or per-net-namespace initialized: sysctl_{w|r}mem_default initialized in smc_net_init() when current net namespace initialized. The default values of SMC inherit from TCP, and clcsock use TCP's configures. 2. create SMC socket: smc_create() is called, and smc->sk.sk_{snd|rcv}buf are initialized from per-netns earlier. There are no different from before. Except for changing the values of TCP after SMC initialized, users can change them with newly added two knobs. 3. applications call setsockopt() to modify SO_SNDBUF or SO_RCVBUF: After smc_create() creates socket, applications can use setsockopt() to change the snd/rcvbuf, which called sock_setsockopt() directly. If fallback happened, setsockopt() would change clcsock's values. In the end, we hope to provide a flexibility approach to change SMC's buffer size only and don't disturb others. Sysctl are considered as a better way to maintain and easy to use for users. Thanks, Tony Lu