Hello,

> -----Original Message-----
> From: Peter Xu [mailto:peterx@xxxxxxxxxx]
> Sent: Monday, May 6, 2024 11:18 PM
> To: Gonglei (Arei) <arei.gonglei@xxxxxxxxxx>
> Cc: Daniel P. Berrangé <berrange@xxxxxxxxxx>; Markus Armbruster
> <armbru@xxxxxxxxxx>; Michael Galaxy <mgalaxy@xxxxxxxxxx>; Yu Zhang
> <yu.zhang@xxxxxxxxx>; Zhijian Li (Fujitsu) <lizhijian@xxxxxxxxxxx>; Jinpu Wang
> <jinpu.wang@xxxxxxxxx>; Elmar Gerdes <elmar.gerdes@xxxxxxxxx>;
> qemu-devel@xxxxxxxxxx; Yuval Shaia <yuval.shaia.ml@xxxxxxxxx>; Kevin Wolf
> <kwolf@xxxxxxxxxx>; Prasanna Kumar Kalever
> <prasanna.kalever@xxxxxxxxxx>; Cornelia Huck <cohuck@xxxxxxxxxx>;
> Michael Roth <michael.roth@xxxxxxx>; Prasanna Kumar Kalever
> <prasanna4324@xxxxxxxxx>; integration@xxxxxxxxxxx; Paolo Bonzini
> <pbonzini@xxxxxxxxxx>; qemu-block@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxx;
> Hanna Reitz <hreitz@xxxxxxxxxx>; Michael S. Tsirkin <mst@xxxxxxxxxx>;
> Thomas Huth <thuth@xxxxxxxxxx>; Eric Blake <eblake@xxxxxxxxxx>; Song
> Gao <gaosong@xxxxxxxxxxx>; Marc-André Lureau
> <marcandre.lureau@xxxxxxxxxx>; Alex Bennée <alex.bennee@xxxxxxxxxx>;
> Wainer dos Santos Moschetta <wainersm@xxxxxxxxxx>; Beraldo Leal
> <bleal@xxxxxxxxxx>; Pannengyuan <pannengyuan@xxxxxxxxxx>;
> Xiexiangyou <xiexiangyou@xxxxxxxxxx>
> Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
>
> On Mon, May 06, 2024 at 02:06:28AM +0000, Gonglei (Arei) wrote:
> > Hi, Peter
>
> Hey, Lei,
>
> Happy to see you around again after years.
>

Haha, me too.

> > RDMA features high bandwidth, low latency (in non-blocking lossless
> > network), and direct remote memory access by bypassing the CPU (As you
> > know, CPU resources are expensive for cloud vendors, which is one of
> > the reasons why we introduced offload cards.), which TCP does not have.
>
> It's another cost to use offload cards, v.s. preparing more cpu resources?
>

A converged software and hardware offload architecture is the way to go for all cloud vendors (it brings comprehensive benefits in terms of performance, cost, security, and innovation speed); it's not just a matter of adding the resources of a DPU card.

> > In some scenarios where fast live migration is needed (extremely short
> > interruption duration and migration duration) is very useful. To this
> > end, we have also developed RDMA support for multifd.
>
> Will any of you upstream that work? I'm curious how intrusive would it be
> when adding it to multifd, if it can keep only 5 exported functions like what
> rdma.h does right now it'll be pretty nice. We also want to make sure it works
> with arbitrary sized loads and buffers, e.g. vfio is considering to add IO loads to
> multifd channels too.
>

In fact, we sent the patchset to the community in 2021. Please see:
https://lore.kernel.org/all/20210203185906.GT2950@work-vm/T/

> One thing to note that the question here is not about a pure performance
> comparison between rdma and nics only. It's about help us make a decision
> on whether to drop rdma, iow, even if rdma performs well, the community still
> has the right to drop it if nobody can actively work and maintain it.
> It's just that if nics can perform as good it's more a reason to drop, unless
> companies can help to provide good support and work together.
>

We are happy to provide the necessary review and maintenance work for RDMA if the community needs it.

CC'ing Chuan Zheng.

Regards,
-Gonglei