Hi Gonglei,

On Tue, May 28, 2024 at 11:06 AM Gonglei (Arei) <arei.gonglei@xxxxxxxxxx> wrote:
>
> Hi Peter,
>
> > -----Original Message-----
> > From: Peter Xu [mailto:peterx@xxxxxxxxxx]
> > Sent: Wednesday, May 22, 2024 6:15 AM
> > To: Yu Zhang <yu.zhang@xxxxxxxxx>
> > Cc: Michael Galaxy <mgalaxy@xxxxxxxxxx>; Jinpu Wang
> > <jinpu.wang@xxxxxxxxx>; Elmar Gerdes <elmar.gerdes@xxxxxxxxx>;
> > zhengchuan <zhengchuan@xxxxxxxxxx>; Gonglei (Arei)
> > <arei.gonglei@xxxxxxxxxx>; Daniel P. Berrangé <berrange@xxxxxxxxxx>;
> > Markus Armbruster <armbru@xxxxxxxxxx>; Zhijian Li (Fujitsu)
> > <lizhijian@xxxxxxxxxxx>; qemu-devel@xxxxxxxxxx; Yuval Shaia
> > <yuval.shaia.ml@xxxxxxxxx>; Kevin Wolf <kwolf@xxxxxxxxxx>; Prasanna
> > Kumar Kalever <prasanna.kalever@xxxxxxxxxx>; Cornelia Huck
> > <cohuck@xxxxxxxxxx>; Michael Roth <michael.roth@xxxxxxx>; Prasanna
> > Kumar Kalever <prasanna4324@xxxxxxxxx>; Paolo Bonzini
> > <pbonzini@xxxxxxxxxx>; qemu-block@xxxxxxxxxx; devel@xxxxxxxxxxxxxxxxx;
> > Hanna Reitz <hreitz@xxxxxxxxxx>; Michael S. Tsirkin <mst@xxxxxxxxxx>;
> > Thomas Huth <thuth@xxxxxxxxxx>; Eric Blake <eblake@xxxxxxxxxx>; Song
> > Gao <gaosong@xxxxxxxxxxx>; Marc-André Lureau
> > <marcandre.lureau@xxxxxxxxxx>; Alex Bennée <alex.bennee@xxxxxxxxxx>;
> > Wainer dos Santos Moschetta <wainersm@xxxxxxxxxx>; Beraldo Leal
> > <bleal@xxxxxxxxxx>; Pannengyuan <pannengyuan@xxxxxxxxxx>;
> > Xiexiangyou <xiexiangyou@xxxxxxxxxx>; Fabiano Rosas <farosas@xxxxxxx>
> > Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
> >
> > On Fri, May 17, 2024 at 03:01:59PM +0200, Yu Zhang wrote:
> > > Hello Michael and Peter,
> >
> > Hi,
> >
> > >
> > > Exactly, not so compelling, as I did it first only on servers widely
> > > used for production in our data center. The network adapters are
> > >
> > > Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720
> > > 2-port Gigabit Ethernet PCIe
> >
> > Hmm...
> > I definitely think Jinpu's Mellanox ConnectX-6 looks more reasonable.
> >
> > https://lore.kernel.org/qemu-devel/CAMGffEn-DKpMZ4tA71MJYdyemg0Zda15wVAqk81vXtKzx-LfJQ@xxxxxxxxxxxxxx/
> >
> > I appreciate everyone helping with the testing.
> >
> > > InfiniBand controller: Mellanox Technologies MT27800 Family
> > > [ConnectX-5]
> > >
> > > which doesn't meet our purpose. I can choose RDMA or TCP for VM
> > > migration. RDMA traffic is through InfiniBand and TCP through Ethernet
> > > on these two hosts. One is standby while the other is active.
> > >
> > > Now I'll try on a server with more recent Ethernet and InfiniBand
> > > network adapters. One of them has:
> > > BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
> > >
> > > The comparison between RDMA and TCP on the same NIC could make more
> > > sense.
> >
> > It looks to me that NICs are powerful now, but again, as I mentioned, I
> > don't think that's a reason to deprecate rdma, especially if QEMU's rdma
> > migration has the chance to be refactored using rsocket.
> >
> > Has anyone started looking in that direction? Would it make sense to
> > start a PoC now?
> >
>
> My team has finished the PoC refactoring, which works well.
>
> Progress:
> 1. Implement io/channel-rdma.c,
> 2. Add the unit test tests/unit/test-io-channel-rdma.c and verify that it passes,
> 3. Remove the original code from migration/rdma.c,
> 4. Rewrite the rdma_start_outgoing_migration and rdma_start_incoming_migration logic,
> 5. Remove all rdma_xxx functions from migration/ram.c (to prevent RDMA live migration from polluting the core logic of live migration),
> 6. Test RDMA live migration with soft-RoCE (the software implementation); the test is successful.
>
> We will submit the patchset later.
>

Thanks for working on this PoC and sharing the progress; we are looking forward to the patchset.

>
> Regards,
> -Gonglei

Regards!
Jinpu

>
> > Thanks,
> >
> > --
> > Peter Xu
>