Re: [PATCH net-next v3 00/18] net/smc: implement virtual ISM extension and loopback-ism

On 21.09.23 15:19, Wen Gu wrote:
Hi, all

# Background

SMC-D is currently used on IBM Z with the ISM function to optimize network
interconnect for intra-CPC communication. Inspired by this, we try to make SMC-D
available on non-s390 architectures through a software-simulated virtual ISM
device, such as the loopback-ism device introduced here, to accelerate
inter-process or inter-container communication within the same OS.

# Design

This patch set includes 4 parts:

  - Patch #1-#3: decouple hard-coded ISM device dependencies from the SMC-D stack.
  - Patch #4-#8: implement virtual ISM extension defined in SMCv2.1.
  - Patch #9-#13: implement loopback-ism device.
  - Patch #14-#18: memory copy optimization for the case using loopback.

The loopback-ism device is designed as a kernel device that is not limited to
a specific net namespace. Both ends of an inter-process connection (1/1' in the
diagram below) or an inter-container connection (2/2' in the diagram below) will
find during the CLC handshake that the peer shares the same loopback-ism device,
and the loopback-ism device will then be chosen.

  Container 1 (ns1)                              Container 2 (ns2)
  +-----------------------------------------+    +-------------------------+
  | +-------+      +-------+      +-------+ |    |        +-------+        |
  | | App A |      | App B |      | App C | |    |        | App D |<-+     |
  | +-------+      +---^---+      +-------+ |    |        +-------+  |(2') |
  |     |127.0.0.1 (1')|             |192.168.0.11       192.168.0.12|     |
  |  (1)|   +--------+ | +--------+  |(2)   |    | +--------+   +--------+ |
  |     `-->|   lo   |-` |  eth0  |<-`      |    | |   lo   |   |  eth0  | |
  +---------+--|---^-+---+-----|--+---------+    +-+--------+---+-^------+-+
               |   |           |                                  |
  Kernel       |   |           |                                  |
  +----+-------v---+-----------v----------------------------------+---+----+
  |    |                            TCP                               |    |
  |    |                                                              |    |
  |    +--------------------------------------------------------------+    |
  |                                                                        |
  |                           +--------------+                             |
  |                           | smc loopback |                             |
  +---------------------------+--------------+-----------------------------+
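
The selection step described above can be modelled in user space. This is
only a rough sketch, not the kernel's CLC code: the DeviceId type, the
pick_shared_device() helper and the CHID/GID values are all hypothetical,
and matching on a (CHID, GID) pair is a simplification that fits the
loopback case, where both ends see the one kernel-wide device.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DeviceId:
    """Hypothetical identity of an SMC-D device as seen by one endpoint."""
    chid: int  # channel ID (values used below are made up)
    gid: int   # global ID of the device instance (made up as well)
    name: str


def pick_shared_device(local: Sequence[DeviceId],
                       proposed: Sequence[DeviceId]) -> Optional[DeviceId]:
    """Return the first local device that the peer also proposed, else None."""
    offered = {(d.chid, d.gid) for d in proposed}
    for dev in local:
        if (dev.chid, dev.gid) in offered:
            return dev
    return None  # no shared device found in this toy model


# Both namespaces see the same kernel-wide loopback device.
loopback = DeviceId(chid=0x1, gid=0xABCD, name="loopback-ism")
chosen = pick_shared_device([loopback], [loopback])
print(chosen.name if chosen else "no shared device")
```

Because the device is not bound to a net namespace, App A/App B and
App C/App D in the diagram would both end up with the same candidate in
their lists, which is why the match succeeds in either case.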


The loopback-ism device allocates RMBs and sndbufs for each connection peer and
'moves' data from the sndbuf at one end to the RMB at the other end. Since
communication occurs within the same kernel, the sndbuf can be mapped to the
peer RMB so that the data copy in the loopback-ism case can be avoided.

  Container 1 (ns1)                              Container 2 (ns2)
  +-----------------------------------------+    +-------------------------+
  | +-------+      +-------+      +-------+ |    |        +-------+        |
  | | App A |      | App B |      | App C | |    |        | App D |        |
  | +-------+      +--^----+      +-------+ |    |        +---^---+        |
  |       |           |               |     |    |            |            |
  |   (1) |      (1') |           (2) |     |    |       (2') |            |
  |       |           |               |     |    |            |            |
  +-------|-----------|---------------|-----+    +------------|------------+
          |           |               |                       |
  Kernel  |           |               |                       |
  +-------|-----------|---------------|-----------------------|------------+
  | +-----v-+      +-------+      +---v---+               +-------+        |
  | | snd A |-+    | RMB B |<--+  | snd C |-+          +->| RMB D |        |
  | +-------+ |    +-------+   |  +-------+ |          |  +-------+        |
  | +-------+ |    +-------+   |  +-------+ |          |  +-------+        |
  | | RMB A | |    | snd B |   |  | RMB C | |          |  | snd D |        |
  | +-------+ |    +-------+   |  +-------+ |          |  +-------+        |
  |           |               +-------------v+         |                   |
  |           +-------------->| smc loopback |---------+                   |
  +---------------------------+--------------+-----------------------------+
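
The difference between 'moving' data and mapping the sndbuf to the peer RMB
can be illustrated with a small user-space model. This is a sketch under
simplifying assumptions: the Endpoint class and the function names are
hypothetical, buffers are plain bytearrays, and cursor handling and
wrap-around are ignored entirely.

```python
class Endpoint:
    """One connection end with a send buffer and a receive buffer (RMB)."""

    def __init__(self, size: int) -> None:
        self.sndbuf = bytearray(size)
        self.rmb = bytearray(size)


def _check_fits(buf: bytearray, payload: bytes) -> None:
    if len(payload) > len(buf):
        raise ValueError("payload larger than buffer")


def send_with_copy(src: Endpoint, dst: Endpoint, payload: bytes) -> None:
    """Stage the payload in the sndbuf, then 'move' it to the peer RMB."""
    _check_fits(src.sndbuf, payload)
    n = len(payload)
    src.sndbuf[:n] = payload      # first copy: application -> sndbuf
    dst.rmb[:n] = src.sndbuf[:n]  # second copy: sndbuf -> peer RMB


def attach_sndbuf_to_peer_rmb(src: Endpoint, dst: Endpoint) -> None:
    """Make the sender's sndbuf an alias of the peer RMB (the mapped case)."""
    src.sndbuf = dst.rmb


def send_mapped(src: Endpoint, payload: bytes) -> None:
    """Write once into the sndbuf; the peer sees it via the shared memory."""
    _check_fits(src.sndbuf, payload)
    src.sndbuf[:len(payload)] = payload


a, b = Endpoint(16), Endpoint(16)
send_with_copy(a, b, b"hello")    # two copies

c, d = Endpoint(16), Endpoint(16)
attach_sndbuf_to_peer_rmb(c, d)
send_mapped(c, b"world")          # one copy, lands directly in d.rmb
```

In the mapped case the write into c.sndbuf is already the write into d.rmb,
so the second copy disappears. The real series also has to adjust the cursor
update logic once the sndbuf is mapped to the RMB, which this model leaves out.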

# Benchmark Test

  * Test environments:
       - VM with Intel Xeon Platinum 8 core 2.50GHz, 16 GiB mem.
       - SMC sndbuf/RMB size 1MB.

  * Test objects:
       - TCP: run on TCP loopback.
       - domain: run on UNIX domain sockets.
       - SMC-lo: run on SMC loopback device.

1. ipc-benchmark (see [1])

  - ./<foo> -c 1000000 -s 100

                           TCP              domain              SMC-lo
Message rate (msg/s)     78855     107621(+36.41%)     153351(+94.47%)

2. sockperf

  - serv: <smc_run> taskset -c <cpu> sockperf sr --tcp
  - clnt: <smc_run> taskset -c <cpu> sockperf { tp | pp } --tcp --msg-size={ 64000 for tp | 14 for pp } -i 127.0.0.1 -t 30

                             TCP                  SMC-lo
Bandwidth(MBps)        5169.250       8007.080(+54.89%)
Latency(us)               6.122          3.174(-48.15%)

3. nginx/wrk

  - serv: <smc_run> nginx
  - clnt: <smc_run> wrk -t 8 -c 1000 -d 30 http://127.0.0.1:80

                            TCP                   SMC-lo
Requests/s           197432.19       261056.09(+32.22%)

4. redis-benchmark

  - serv: <smc_run> redis-server
  - clnt: <smc_run> redis-benchmark -h 127.0.0.1 -q -t set,get -n 400000 -c 200 -d 1024

                            TCP                   SMC-lo
GET(Requests/s)       86244.07       122025.62(+41.48%)
SET(Requests/s)       86749.08       120048.02(+38.38%)

[1] https://github.com/goldsborough/ipc-bench

v2->v3:
  - Fix build warning of patch#1 and patch#10.

v1->v2:
  - Fix build error on s390 arch.

Wen Gu (18):
   net/smc: decouple ism_dev from SMC-D device dump
   net/smc: decouple ism_dev from SMC-D DMB registration
   net/smc: extract v2 check helper from SMC-D device registration
   net/smc: support SMCv2.x supplemental features negotiation
   net/smc: reserve CHID range for SMC-D virtual device
   net/smc: extend GID to 128bits for virtual ISM device
   net/smc: disable SEID on non-s390 architecture
   net/smc: enable virtual ISM device feature bit
   net/smc: introduce SMC-D loopback device
   net/smc: implement ID-related operations of loopback
   net/smc: implement some unsupported operations of loopback
   net/smc: implement DMB-related operations of loopback
   net/smc: register loopback device as SMC-Dv2 device
   net/smc: add operation for getting DMB attribute
   net/smc: add operations for DMB attach and detach
   net/smc: avoid data copy from sndbuf to peer RMB in SMC-D
   net/smc: modify cursor update logic when sndbuf mapped to RMB
   net/smc: add interface implementation of loopback device

  drivers/s390/net/ism_drv.c |  19 +-
  include/net/smc.h          |  32 ++-
  include/uapi/linux/smc.h   |   3 +
  net/smc/Makefile           |   2 +-
  net/smc/af_smc.c           |  73 +++++--
  net/smc/smc.h              |   7 +
  net/smc/smc_cdc.c          |  56 ++++--
  net/smc/smc_cdc.h          |   1 +
  net/smc/smc_clc.c          |  56 ++++--
  net/smc/smc_clc.h          |  10 +-
  net/smc/smc_core.c         | 108 +++++++++-
  net/smc/smc_core.h         |   9 +-
  net/smc/smc_diag.c         |   6 +-
  net/smc/smc_ism.c          | 100 +++++++---
  net/smc/smc_ism.h          |  24 ++-
  net/smc/smc_loopback.c     | 483 +++++++++++++++++++++++++++++++++++++++++++++
  net/smc/smc_loopback.h     |  52 +++++
  net/smc/smc_pnet.c         |   4 +-
  18 files changed, 946 insertions(+), 99 deletions(-)
  create mode 100644 net/smc/smc_loopback.c
  create mode 100644 net/smc/smc_loopback.h


Hi Wen,

Thank you for the effort!
You can find my comments in the respective patches. One general question from our team: could you please add a Kconfig option to turn loopback-ism on and off?

BTW, I'm on vacation next week; my colleagues will follow up on the answers and updates.

Thanks,
Wenjia



