RE: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

Hello,

> -----Original Message-----
> From: Peter Xu [mailto:peterx@xxxxxxxxxx]
> Sent: Monday, May 6, 2024 11:18 PM
> To: Gonglei (Arei) <arei.gonglei@xxxxxxxxxx>
> Cc: Daniel P. Berrangé <berrange@xxxxxxxxxx>; Markus Armbruster
> <armbru@xxxxxxxxxx>; Michael Galaxy <mgalaxy@xxxxxxxxxx>; Yu Zhang
> <yu.zhang@xxxxxxxxx>; Zhijian Li (Fujitsu) <lizhijian@xxxxxxxxxxx>; Jinpu Wang
> <jinpu.wang@xxxxxxxxx>; Elmar Gerdes <elmar.gerdes@xxxxxxxxx>;
> qemu-devel@xxxxxxxxxx; Yuval Shaia <yuval.shaia.ml@xxxxxxxxx>; Kevin Wolf
> <kwolf@xxxxxxxxxx>; Prasanna Kumar Kalever
> <prasanna.kalever@xxxxxxxxxx>; Cornelia Huck <cohuck@xxxxxxxxxx>;
> Michael Roth <michael.roth@xxxxxxx>; Prasanna Kumar Kalever
> <prasanna4324@xxxxxxxxx>; integration@xxxxxxxxxxx; Paolo Bonzini
> <pbonzini@xxxxxxxxxx>; qemu-block@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxx;
> Hanna Reitz <hreitz@xxxxxxxxxx>; Michael S. Tsirkin <mst@xxxxxxxxxx>;
> Thomas Huth <thuth@xxxxxxxxxx>; Eric Blake <eblake@xxxxxxxxxx>; Song
> Gao <gaosong@xxxxxxxxxxx>; Marc-André Lureau
> <marcandre.lureau@xxxxxxxxxx>; Alex Bennée <alex.bennee@xxxxxxxxxx>;
> Wainer dos Santos Moschetta <wainersm@xxxxxxxxxx>; Beraldo Leal
> <bleal@xxxxxxxxxx>; Pannengyuan <pannengyuan@xxxxxxxxxx>;
> Xiexiangyou <xiexiangyou@xxxxxxxxxx>
> Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
> 
> On Mon, May 06, 2024 at 02:06:28AM +0000, Gonglei (Arei) wrote:
> > Hi, Peter
> 
> Hey, Lei,
> 
> Happy to see you around again after years.
> 
Haha, me too.

> > RDMA features high bandwidth, low latency (in a non-blocking lossless
> > network), and direct remote memory access that bypasses the CPU (as you
> > know, CPU resources are expensive for cloud vendors, which is one of
> > the reasons we introduced offload cards), which TCP does not have.
> 
> Isn't using offload cards just another cost, vs. preparing more CPU resources?
> 
A converged software/hardware offload architecture is the way to go for all cloud vendors
(considering the overall benefits in performance, cost, security, and speed of innovation);
it's not just a matter of adding a DPU card as an extra resource.

> > RDMA is very useful in scenarios that require fast live migration
> > (extremely short interruption and migration durations). To this
> > end, we have also developed RDMA support for multifd.
> 
> Will any of you upstream that work?  I'm curious how intrusive it would be
> to add it to multifd; if it can keep to only the 5 exported functions that
> rdma.h has right now, that would be pretty nice.  We also want to make sure it
> works with arbitrarily sized loads and buffers, e.g. VFIO is considering adding
> IO loads to multifd channels too.
> 

In fact, we sent the patchset to the community in 2021. Please see:
https://lore.kernel.org/all/20210203185906.GT2950@work-vm/T/


> One thing to note is that the question here is not a pure performance
> comparison between RDMA and NICs.  It's about helping us make a decision
> on whether to drop RDMA; in other words, even if RDMA performs well, the
> community still has the right to drop it if nobody can actively work on and
> maintain it.  If NICs can perform just as well, that's all the more reason
> to drop it, unless companies can help provide good support and work together.
> 

We are happy to provide the necessary review and maintenance work for RDMA
if the community needs it.

CC'ing Chuan Zheng.


Regards,
-Gonglei

