Hi Michael, Hi Peter,

On Thu, May 2, 2024 at 3:23 PM Michael Galaxy <mgalaxy@xxxxxxxxxx> wrote:
>
> Yu Zhang / Jinpu,
>
> Any possibility (at your leisure, and within the disclosure rules of
> your company, IONOS) if you could share any of your performance
> information to educate the group?
>
> NICs have indeed changed, but not everybody has 100ge mellanox cards at
> their disposal. Some people don't.

Our staging environment runs on 100 Gb/s IB. We will have a new setup
with Ethernet (RoCE) in the coming months, and we will run some
performance comparisons once that environment is ready.

>
> - Michael

Thx!
Jinpu

>
> On 5/1/24 11:16, Peter Xu wrote:
> > On Wed, May 01, 2024 at 04:59:38PM +0100, Daniel P. Berrangé wrote:
> >> On Wed, May 01, 2024 at 11:31:13AM -0400, Peter Xu wrote:
> >>> What I worry more is whether this is really what we want to keep rdma in
> >>> qemu, and that's also why I was trying to request for some serious
> >>> performance measurements comparing rdma v.s. nics. And here when I said
> >>> "we" I mean both QEMU community and any company that will support keeping
> >>> rdma around.
> >>>
> >>> The problem is if NICs now are fast enough to perform at least equally
> >>> against rdma, and if it has a lower cost of overall maintenance, does it
> >>> mean that rdma migration will only be used by whoever wants to keep them in
> >>> the products and existed already? In that case we should simply ask new
> >>> users to stick with tcp, and rdma users should only drop but not increase.
> >>>
> >>> It seems also destined that most new migration features will not support
> >>> rdma: see how much we drop old features in migration now (which rdma
> >>> _might_ still leverage, but maybe not), and how much we add mostly multifd
> >>> relevant which will probably not apply to rdma at all.
> >>> So in general what I am worrying is a both-loss condition, if the
> >>> company might be easier to either stick with an old qemu (depending on
> >>> whether other new features are requested to be used besides RDMA alone),
> >>> or do periodic rebase with RDMA downstream only.
> >> I don't know much about the origins of RDMA support in QEMU and why
> >> this particular design was taken. It is indeed a huge maint burden to
> >> have a completely different code flow for RDMA with 4000+ lines of
> >> custom protocol signalling which is barely understandable.
> >>
> >> I would note that /usr/include/rdma/rsocket.h provides a higher level
> >> API that is a 1-1 match of the normal kernel 'sockets' API. If we had
> >> leveraged that, then QIOChannelSocket class and the QAPI SocketAddress
> >> type could almost[1] trivially have supported RDMA. There would have
> >> been almost no RDMA code required in the migration subsystem, and all
> >> the modern features like compression, multifd, post-copy, etc would
> >> "just work".
> >>
> >> I guess the 'rsocket.h' shim may well limit some of the possible
> >> performance gains, but it might still have been a better tradeoff
> >> to have not quite so good peak performance, but with massively
> >> less maint burden.
> > My understanding so far is RDMA is solely for performance but nothing else,
> > then it's a question on whether rdma existing users would like to do so if
> > it will run slower.
> >
> > Jinpu mentioned on the explicit usages of ib verbs but I am just mostly
> > quoting that word as I don't really know such details:
> >
> > https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/CAMGffEm2TWJxOPcNQTQ1Sjytf5395dBzTCMYiKRqfxDzJwSN6A@xxxxxxxxxxxxxx/__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOew9oW_kg$
> >
> > So not sure whether that applies here too, in that having qiochannel
> > wrapper may not allow direct access to those ib verbs.
> >
> > Thanks,
> >
> >> With regards,
> >> Daniel
> >>
> >> [1] "almost" trivially, because the poll() integration for rsockets
> >>     requires a bit more magic sauce since rsockets FDs are not
> >>     really FDs from the kernel's POV. Still, QIOChannel likely can
> >>     abstract that problem.
> >> --
> >> |: https://urldefense.com/v3/__https://berrange.com__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOfyTmFFUQ$ -o- https://urldefense.com/v3/__https://www.flickr.com/photos/dberrange__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOf8A5OC0Q$ :|
> >> |: https://urldefense.com/v3/__https://libvirt.org__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOf3gffAdg$ -o- https://urldefense.com/v3/__https://fstop138.berrange.com__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOfPMofYqw$ :|
> >> |: https://urldefense.com/v3/__https://entangle-photo.org__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOeQ5jjAeQ$ -o- https://urldefense.com/v3/__https://www.instagram.com/dberrange__;!!GjvTz_vk!W6-HGWM-XkF_52am249DrLIDQeZctVOHg72LvOHGUcwxqQM5mY0GNYYl-yNJslN7A5GfLOfhaDF9WA$ :|