Re: [RFC net-next] net/smc:introduce 1RTT to SMC

On Wed, May 25, 2022 at 03:42:28PM +0200, Alexandra Winter wrote:
> 
> 
> On 24.05.22 09:49, Tony Lu wrote:
> > On Tue, May 24, 2022 at 02:52:07PM +0800, D. Wythe wrote:
> >> From: "D. Wythe" <alibuda@xxxxxxxxxxxxxxxxx>
> >>
> >> Hi Karsten,
> >>
> >> We are promoting SMC-R in the field of cloud computing. Due to the
> >> particularity of business on the cloud, the scale and the types of
> >> customer applications are unpredictable. As a participant in SMC-R, we
> >> also hope that SMC-R can cover more application scenarios. As a result,
> >> many connection problems have been exposed during this time. There are
> >> two main issues: one is that establishing a single connection takes
> >> longer than with TCP, the other is that the degree of concurrency
> >> is low under multi-connection processing. This patch set mainly
> >> addresses the first issue, and follow-up work on the second issue
> >> will be posted in the future.
> >>
> >> In terms of the communication process, under the current
> >> implementation a TCP three-way handshake needs only 1-RTT, while SMC-R
> >> currently requires 4-RTT, including 2-RTT over IP (TCP handshake, SMC
> >> proposal & accept) and 2-RTT over IB (two RKEY exchanges). This is the
> >> most influential factor affecting connection establishment time at the
> >> moment.
> >>
> >> We have noticed that a single network interface card is mainstream on
> >> the cloud, due to the advantages in cloud deployment cost and the
> >> cloud's own disaster recovery support. On the other hand, the emergence
> >> of RoCE LAG technology means we no longer need to deal with multiple
> >> RDMA network interface cards ourselves, just as with NIC bonding. At
> >> Alibaba, RoCE LAG is widely used for RDMA.
> > 
> > I think whether we need SMC-level link redundancy is an interesting
> > topic. I agree that RoCE LAG and RDMA in cloud vendors handle
> > redundancy and failover in the lower layer, and do so transparently for
> > SMC.
> > 
> > So, moving on: if an RDMA device has redundancy capability, we could
> > make SMC simpler by providing an option to user space, or by deciding
> > based on the device capability (if we have such a flag). This allows
> > the underlying layer to ensure the reliability of the link group.
> > 
> > As RFC 7609 mentions, adding a link requires some extra work for
> > reliability. This work should be optional if the device is capable of
> > redundancy, which would make the link group simpler and faster (for the
> > so-called SMC-2RTT in this RFC).
> > 
> > I also notice that RFC 7609 was released in August 2015, which is
> > earlier than RoCE LAG. RoCE LAG was provided after ConnectX-3/ConnectX-3
> > Pro in kernel 4.0, and was available in 2017. There are also cloud
> > vendors' RDMA adapters, such as the Alibaba Elastic RDMA adapter in [1].
> > 
> > Given that, I propose making the second link optional in a newly
> > created link group. Also, if possible, RFC 7609 could be updated or
> > extended to cover this present-day case.
> > 
> > Looking forward to your messages, Karsten, D. Wythe and folks.
> > 
> > [1] https://lore.kernel.org/linux-rdma/20220523075528.35017-1-chengyou@xxxxxxxxxxxxxxxxx/
> > 
> > Thanks,
> > Tony Lu
> >  
> Thank you D. Wythe for your proposals, the prototype and measurements.
> They sound quite promising to us.
> 
> We need to carefully evaluate them and make sure everything is compatible
> with the existing implementations of SMC-D and SMC-R v1 and v2. In the
> typical s390 environment RoCE LAG is probably not good enough, as the card
> is still a single point of failure. So your ideas need to be compatible
> with link redundancy. We also need to consider that the extension of the
> protocol does not block other desirable extensions.
> 
> Your prototype is very helpful for our understanding. Before submitting any
> code patches to net-next, we should agree on the details of the protocol
> extension. Maybe you could formulate your proposal in plain text, so we can
> discuss it here? 
> 
> We also need to inform you that several public holidays are upcoming in the
> next weeks and several of our team will be out for summer vacation, so please
> allow for longer response times.
> 
> Kind regards
> Alexandra Winter
> 
Glad to hear this. It gives us a lot of confidence to keep pursuing
it, thank you.

Cheers,
Tony Lu


