RE: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

Hi, Peter

RDMA offers high bandwidth, low latency (in a non-blocking lossless network), and direct remote
memory access that bypasses the CPU, features that TCP does not have. (As you know, CPU resources
are expensive for cloud vendors, which is one of the reasons why we introduced offload cards.)

In scenarios that need fast live migration (extremely short interruption and migration
duration), RDMA is very useful. To this end, we have also developed RDMA support for multifd.

Regards,
-Gonglei

> -----Original Message-----
> From: Peter Xu [mailto:peterx@xxxxxxxxxx]
> Sent: Wednesday, May 1, 2024 11:31 PM
> To: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> Cc: Markus Armbruster <armbru@xxxxxxxxxx>; Michael Galaxy
> <mgalaxy@xxxxxxxxxx>; Yu Zhang <yu.zhang@xxxxxxxxx>; Zhijian Li (Fujitsu)
> <lizhijian@xxxxxxxxxxx>; Jinpu Wang <jinpu.wang@xxxxxxxxx>; Elmar Gerdes
> <elmar.gerdes@xxxxxxxxx>; qemu-devel@xxxxxxxxxx; Yuval Shaia
> <yuval.shaia.ml@xxxxxxxxx>; Kevin Wolf <kwolf@xxxxxxxxxx>; Prasanna
> Kumar Kalever <prasanna.kalever@xxxxxxxxxx>; Cornelia Huck
> <cohuck@xxxxxxxxxx>; Michael Roth <michael.roth@xxxxxxx>; Prasanna
> Kumar Kalever <prasanna4324@xxxxxxxxx>; integration@xxxxxxxxxxx; Paolo
> Bonzini <pbonzini@xxxxxxxxxx>; qemu-block@xxxxxxxxxx;
> devel@xxxxxxxxxxxxxxxxx; Hanna Reitz <hreitz@xxxxxxxxxx>; Michael S. Tsirkin
> <mst@xxxxxxxxxx>; Thomas Huth <thuth@xxxxxxxxxx>; Eric Blake
> <eblake@xxxxxxxxxx>; Song Gao <gaosong@xxxxxxxxxxx>; Marc-André
> Lureau <marcandre.lureau@xxxxxxxxxx>; Alex Bennée
> <alex.bennee@xxxxxxxxxx>; Wainer dos Santos Moschetta
> <wainersm@xxxxxxxxxx>; Beraldo Leal <bleal@xxxxxxxxxx>; Gonglei (Arei)
> <arei.gonglei@xxxxxxxxxx>; Pannengyuan <pannengyuan@xxxxxxxxxx>
> Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling
> 
> On Tue, Apr 30, 2024 at 09:00:49AM +0100, Daniel P. Berrangé wrote:
> > On Tue, Apr 30, 2024 at 09:15:03AM +0200, Markus Armbruster wrote:
> > > Peter Xu <peterx@xxxxxxxxxx> writes:
> > >
> > > > On Mon, Apr 29, 2024 at 08:08:10AM -0500, Michael Galaxy wrote:
> > > >> Hi All (and Peter),
> > > >
> > > > Hi, Michael,
> > > >
> > > >>
> > > >> My name is Michael Galaxy (formerly Hines). Yes, I changed my
> > > >> last name (highly irregular for a male) and yes, that's my real last name:
> > > >> https://www.linkedin.com/in/mrgalaxy/)
> > > >>
> > > >> I'm the original author of the RDMA implementation. I've been
> > > >> discussing with Yu Zhang for a little bit about potentially
> > > >> handing over maintainership of the codebase to his team.
> > > >>
> > > > >> I simply have zero access to RoCE or Infiniband hardware at all,
> > > > >> unfortunately, so I've never been able to run tests or use what I
> > > >> wrote at work, and as all of you know, if you don't have a way to
> > > >> test something, then you can't maintain it.
> > > >>
> > > >> Yu Zhang put a (very kind) proposal forward to me to ask the
> > > >> community if they feel comfortable training his team to maintain
> > > > >> the codebase (and run tests) while they learn about it.
> > > >
> > > > The "while learning" part is fine at least to me.  IMHO the
> > > > "ownership" to the code, or say, taking over the responsibility,
> > > > may or may not need 100% mastering the code base first.  There
> > > > should still be some fundamental confidence to work on the code
> > > > though as a starting point, then it's about serious use case to
> > > > back this up, and careful testings while getting more familiar with it.
> > >
> > > How much experience we expect of maintainers depends on the
> > > subsystem and other circumstances.  The hard requirement isn't
> > > experience, it's trust.  See the recent attack on xz.
> > >
> > > I do not mean to express any doubts whatsoever on Yu Zhang's integrity!
> > > I'm merely reminding y'all what's at stake.
> >
> > I think we shouldn't overly obsess[1] about 'xz', because the
> > overwhelmingly common scenario is that volunteer maintainers are
> > honest people. QEMU is in a massively better peer review situation.
> > With xz there was basically no oversight of the new maintainer. With
> > QEMU, we have oversight from 1000's of people on the list, a huge pool
> > of general maintainers, the specific migration maintainers, and the release
> > manager merging code.
> >
> > With a lack of historical experience with QEMU maintainership, I'd
> > suggest that new RDMA volunteers would start by adding themselves to the
> > "MAINTAINERS" file with only the 'Reviewer' classification. The main migration
> > maintainers would still handle pull requests, but wait for a R-b from
> > one of the RDMA volunteers. After some period of time the RDMA folks
> > could graduate to full maintainer status if the migration maintainers needed
> > to reduce their load.
> > I suspect that might prove unnecessary though, given RDMA isn't an
> > area of code with a high turnover of patches.
> 
> Right, and we can do that as a start, it also follows our normal rules of starting
> from Reviewers to maintain something.  I even considered Zhijian to be the
> previous rdma goto guy / maintainer no matter what role he used to have in
> the MAINTAINERS file.
> 
> Here IMHO it's more about whether any company would like to stand up and
> provide help, without yet binding that to be able to send pull requests in the
> near future or even longer term.
> 
> What I worry about more is whether we really want to keep rdma in
> qemu, and that's also why I was trying to request some serious
> performance measurements comparing rdma vs. nics.  And here when I said
> "we" I mean both QEMU community and any company that will support
> keeping rdma around.
> 
> The problem is if NICs now are fast enough to perform at least equally against
> rdma, and if it has a lower cost of overall maintenance, does it mean that rdma
> migration will only be used by whoever wants to keep them in the products and
> existed already?  In that case we should simply ask new users to stick with tcp,
> and rdma users should only drop but not increase.
> 
> It seems also destined that most new migration features will not support
> rdma: see how much we drop old features in migration now (which rdma
> _might_ still leverage, but maybe not), and how much we add mostly multifd
> relevant which will probably not apply to rdma at all.  So in general what I am
> worrying is a both-loss condition, if the company might be easier to either stick
> with an old qemu (depending on whether other new features are requested to
> be used besides RDMA alone), or do periodic rebase with RDMA downstream
> only.
> 
> So even if we want to keep RDMA around I hope with this chance we can at
> least have clear picture on when we should still suggest any new user to use
> RDMA (with the reasons behind).  Or we simply shouldn't suggest any new
> user to use RDMA at all (because at least it'll lose many new migration
> features).
> 
> Thanks,
> 
> --
> Peter Xu
