From: Ursula Braun <ursula.braun@xxxxxxxxxx>

Dave,

this is V4 of my SMC-R patches. Since you are asking for a solution
"100% in our own separate module with our own can of worms", we have to
give up the transparent detection of whether a communication peer can do
SMC-R or not (this was the purpose of the rejected TCP hooks). Instead,
we just want the new self-contained SMC-R socket family added to the
kernel.

By the way, since August 2015 the SMC-R Informational RFC is no longer a
draft, but published as RFC7609.

V4 changes:
1. Remove tcp patches supporting TCP experimental options
2. Remove references to the tcp_sock syn_smc flag in the smc code, since
   TCP experimental options are not supported by Linux tcp
3. clc_wait_msg() simplified

V3 changes:
1. Avoid adding new space for smc-related bits in the tcp structures
2. Make the smc feature nearly zero cost using Static Keys / jump labels
3. Increase / decrease the smc static key in the smc code
4. Make sure the next-to-last patch does not break the build
5. Additional pnet table checking

V2 changes:
1. Activate tcp changes for CONFIG_AFSMC only (as suggested by
   Eric Dumazet)
2. Add additional hook in net/core/sock.c
3. Fix bitfield endianness problem

Thanks, Ursula

In 2013, IBM introduced an optimized communications solution for the
IBM zEnterprise EC12 and BC12 (s390 in Linux terminology) that consists
of the IBM 10GbE RoCE Express feature with the Shared Memory
Communications-RDMA (SMC-R) protocol [1]. SMC-R is designed for the
enterprise data center environment and is an open protocol as specified
in the informational RFC7609 [2], which was published in August 2015.
Another implementation of this protocol has been available since 2013
with IBM z/OS Version 2 Release 1. SMC-R provides a "sockets over RDMA"
solution that leverages industry standard RDMA over Converged Ethernet
(RoCE) technology.

IBM has developed a Linux implementation of the SMC-R standard. A new
socket protocol family AF_SMC is introduced.
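
As a rough illustration of the new family from an application's point of
view, the sketch below opens a stream socket with AF_SMC and uses a plain
TCP socket if that attempt fails. It is a sketch under assumptions, not
part of the patches: open_stream_socket() is a hypothetical helper, the
fallback number 43 for AF_SMC is what current mainline Linux headers use
and may differ from the value assigned by this series, and passing
protocol 0 is likewise an assumption.

```c
#include <sys/socket.h>

/* Assumption: 43 is the AF_SMC value in current mainline Linux headers;
 * the number assigned by this patch series may differ. */
#ifndef AF_SMC
#define AF_SMC 43
#endif

/*
 * Hypothetical helper, not part of the patches: open a stream socket in
 * the AF_SMC family, or a plain TCP socket if that attempt fails (for
 * example with EAFNOSUPPORT on a kernel without AF_SMC).
 * Returns the descriptor or -1; *is_smc reports which family was used.
 */
int open_stream_socket(int *is_smc)
{
	int fd = socket(AF_SMC, SOCK_STREAM, 0);	/* protocol 0 is assumed */

	*is_smc = (fd >= 0);
	if (fd < 0)
		fd = socket(AF_INET, SOCK_STREAM, 0);	/* ordinary TCP */
	return fd;
}
```

This only covers the case where the local kernel lacks the socket family.
It does not detect whether the remote peer is SMC-R capable.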
A preload library can be used to enable TCP-based applications to use
SMC-R without changes.

Key aspects of SMC-R are:

1. Provides optimized performance compared to standard TCP/IP over
   Ethernet within the data center for both request/response (latency)
   and streaming workloads (CPU savings) [3]. Initial benchmarks on
   Linux on x86 processors have shown a latency reduction of up to 52%
   with a throughput gain of 111% using SMC-R vs TCP for
   request/response message patterns (10 concurrent TCP connections
   with 16KB messages) and CPU savings of up to 69% for streaming data
   patterns (single TCP connection with 20MB of data in one direction).
   [1] is currently being updated to contain more detailed information
   on Linux and performance.

2. In order to preserve the traditional network administrative model,
   the SMC-R protocol ties into the existing IP addresses and uses
   TCP's handshake to establish connections. This allows existing
   management tools and security infrastructure to control the creation
   of SMC connections.

3. The SMC-R protocol logically bonds multiple RoCE adapters together,
   providing redundancy with transparent fail-over for improved high
   availability, increased bandwidth and load balancing across multiple
   RDMA-capable devices.

Without the rejected TCP Experimental Options the following aspects are
restricted; alternate solutions are under discussion.

4. Due to its handshake protocol, SMC-R is compatible with (transparent
   to) existing TCP connection load balancers that are commonly used in
   the enterprise data center environment for multi-tier application
   workloads.

5. SMC-R's handshake protocol allows for transparent fallback to
   TCP/IP, should one of the peers not be capable of the protocol.

Additional SMC-R overview and reference materials are available [1].

The SMC-R "rendezvous" protocol eliminates the need for RDMA-CM; the
exchange occurs through an initial TCP connection. Building on a TCP
connection to establish an SMC-R connection solves many key
requirements.

The rendezvous process now occurs in 1 phase only:

1. The TCP/IP 3-way exchange with TCP experimental options is skipped.

2. SMC-R 3-way exchange: It is assumed that both partners are SMC-R
   capable. At the completion of the 3-way TCP handshake, the SMC-R
   layers in each peer take control of the TCP connection and exchange
   their RDMA credentials. If this 3-way exchange completes
   successfully, the connection continues using SMC-R. If the exchange
   is not successful, the connection falls back to standard TCP/IP.

References:
[1] SMC-R Overview and Reference Materials:
    http://www-01.ibm.com/software/network/commserver/SMCR/
[2] SMC-R Informational RFC:
    https://tools.ietf.org/rfc/rfc7609
[3] Linux SMC-R Overview and Performance Summary (archs x86 and s390):
    http://www-01.ibm.com/software/network/commserver/SMCR/

The patch series is prepared to apply to net-next and consists of these
parts:
1. net: definitions to establish new socket family
2. net/smc: new socket family

In the future, SMC-R will be enhanced to cover:
- alternate SMC-capability detection
- IPv6 support
- Tracing
- Statistics support

shortlog:

Ursula Braun (2):
  net: introduce socket family constants
  smc: introduce socket family AF_SMC

-- 
2.3.8