Yu Zhang / Jinpu,
Is there any possibility (at your leisure, and within the disclosure rules
of your company, IONOS) that you could share some of your performance
data to educate the group?
NICs have indeed changed, but not everybody has 100GbE Mellanox cards at
their disposal.
- Michael
On 5/1/24 11:16, Peter Xu wrote:
On Wed, May 01, 2024 at 04:59:38PM +0100, Daniel P. Berrangé wrote:
On Wed, May 01, 2024 at 11:31:13AM -0400, Peter Xu wrote:
What worries me more is whether we really want to keep rdma in qemu, which
is also why I asked for some serious performance measurements comparing
rdma vs. nics. And when I say "we" here, I mean both the QEMU community
and any company that will support keeping rdma around.
The problem is: if NICs are now fast enough to perform at least as well as
rdma, and if they have a lower overall maintenance cost, does that mean
rdma migration will only be used by those who already have it in their
products and want to keep it? In that case we should simply ask new users
to stick with tcp, and the number of rdma users should only fall, not rise.
It also seems destined that most new migration features will not support
rdma: see how many old features we are dropping from migration now (which
rdma _might_ still leverage, but maybe not), and how much of what we add is
multifd related and will probably not apply to rdma at all. So in general
what I am worried about is a lose-lose situation, where a company finds it
easier either to stick with an old qemu (depending on whether it needs any
other new features besides RDMA), or to do periodic rebases with RDMA kept
downstream only.
I don't know much about the origins of RDMA support in QEMU and why
this particular design was taken. It is indeed a huge maint burden to
have a completely different code flow for RDMA with 4000+ lines of
custom protocol signalling which is barely understandable.
I would note that /usr/include/rdma/rsocket.h provides a higher level
API that is a 1-1 match of the normal kernel 'sockets' API. If we had
leveraged that, then QIOChannelSocket class and the QAPI SocketAddress
type could almost[1] trivially have supported RDMA. There would have
been almost no RDMA code required in the migration subsystem, and all
the modern features like compression, multifd, post-copy, etc would
"just work".
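
To illustrate the 1-1 API match mentioned above, here is a minimal sketch.
It is not QEMU code: ChannelOps, channel_write_all() and demo_roundtrip()
are hypothetical names made up for this example. It only exercises the
normal kernel sockets backend over a socketpair, so it builds without
librdmacm; the rsockets variant is shown in a comment and assumes that
rsend(), rrecv() and rclose() from <rdma/rsocket.h> keep the same
signatures as send(), recv() and close().

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical transport table; not an existing QEMU type. */
typedef struct ChannelOps {
    ssize_t (*send)(int fd, const void *buf, size_t len, int flags);
    ssize_t (*recv)(int fd, void *buf, size_t len, int flags);
    int (*close)(int fd);
} ChannelOps;

/* Backend using the normal kernel sockets API. */
static const ChannelOps tcp_ops = { send, recv, close };

/*
 * Assumption: with librdmacm installed, an rsockets backend would be
 *
 *     #include <rdma/rsocket.h>
 *     static const ChannelOps rdma_ops = { rsend, rrecv, rclose };
 *
 * and the callers below would not need to change.
 */

/* Send the whole buffer, retrying on short writes and EINTR. */
static int channel_write_all(const ChannelOps *ops, int fd,
                             const void *buf, size_t len)
{
    const char *p = buf;

    assert(ops != NULL);
    while (len > 0) {
        ssize_t n = ops->send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Round-trip a short message through the table; returns 0 on success. */
int demo_roundtrip(void)
{
    static const char msg[] = "ping";
    char in[16] = { 0 };
    int sv[2];
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    rc = channel_write_all(&tcp_ops, sv[0], msg, sizeof(msg));
    if (rc == 0) {
        ssize_t n = tcp_ops.recv(sv[1], in, sizeof(msg), MSG_WAITALL);
        rc = (n == (ssize_t)sizeof(msg) && strcmp(in, msg) == 0) ? 0 : -1;
    }
    tcp_ops.close(sv[0]);
    tcp_ops.close(sv[1]);
    return rc;
}
```

The point of the table is only that the caller never names the transport.
Connection setup and poll() integration are left out here, and the latter
is where rsockets needs extra care.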
I guess the 'rsocket.h' shim may well limit some of the possible
performance gains, but it might still have been a better tradeoff
to have not quite so good peak performance, but with massively
less maint burden.
My understanding so far is that RDMA is solely for performance and nothing
else, so the question is whether existing rdma users would want to do that
if it runs slower.
Jinpu mentioned explicit usage of ib verbs, but I am mostly just quoting
that word as I don't really know the details:
https://lore.kernel.org/qemu-devel/CAMGffEm2TWJxOPcNQTQ1Sjytf5395dBzTCMYiKRqfxDzJwSN6A@xxxxxxxxxxxxxx/
So I am not sure whether that applies here too, in that having a qiochannel
wrapper may not allow direct access to those ib verbs.
Thanks,
With regards,
Daniel
[1] "almost" trivially, because the poll() integration for rsockets
requires a bit more magic sauce since rsockets FDs are not
really FDs from the kernel's POV. Still, QIOChannel likely can
abstract that problem.
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|