On 2024/10/8 2:15, Leon Romanovsky wrote:
On Mon, Oct 07, 2024 at 08:45:07AM -0500, Michael Galaxy wrote:
Hi,
On 10/7/24 03:47, Yu Zhang wrote:
Sure, as we discussed at the KVM Forum, a possible approach is to set up
two VMs on a physical host, configure the SoftRoCE, and run the
migration test in two nested VMs to ensure that the migration data
traffic goes through the emulated RDMA hardware. I will continue with
this and let you know.
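
As a rough illustration of the SoftRoCE configuration step mentioned above,
here is a minimal Python sketch that only builds the commands one might run
inside each nested VM. The helper name, the interface name "eth0", and the
link name "rxe0" are assumptions made for illustration; nothing here comes
from an existing test script in this thread, and the sketch prints the
commands rather than executing them.

```python
import shlex
from typing import List


def softroce_setup_commands(netdev: str, rxe_name: str = "rxe0") -> List[List[str]]:
    """Return the commands that would attach a SoftRoCE (RXE) link to netdev.

    Hypothetical helper: netdev and rxe_name are placeholders chosen for
    this example, not names taken from the thread.
    """
    return [
        # Load the RXE kernel module.
        ["modprobe", "rdma_rxe"],
        # Create an RXE link on top of the given Ethernet interface.
        ["rdma", "link", "add", rxe_name, "type", "rxe", "netdev", netdev],
        # List RDMA links so the new one can be checked by eye.
        ["rdma", "link", "show"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in softroce_setup_commands("eth0"):
        print(shlex.join(cmd))
```

Running these commands for real would need root inside each VM; the
migration test itself would then be pointed at the RXE-backed interface.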
Acknowledged. Do share if you have any problems with it, like if it has
compatibility issues or if we need a different solution. We're open to
change.
I'm not familiar with the "current state" of this or how well it would
even work.
Any compatibility issue between versions of RXE (SoftRoCE) or between
RXE and real devices is a bug in RXE, which should be fixed.
RXE is expected to be compatible with all other RoCE devices, both
virtual and physical.
In my tests with physical RoCE devices, for example Nvidia MLX5 and
Intel E810 (iRDMA), RXE works well with them when the RDMA feature is
disabled on those devices.
As for virtual devices, most of them work well with RXE, for example
bonding and veth. I have done a lot of tests with them.
If some virtual devices do not work well with RXE, please share the
error messages on the RDMA mailing list.
Zhu Yanjun
Thanks
- Michael
On Fri, Oct 4, 2024 at 4:06 PM Michael Galaxy <mgalaxy@xxxxxxxxxx> wrote:
On 10/3/24 16:43, Peter Xu wrote:
On Thu, Oct 03, 2024 at 04:26:27PM -0500, Michael Galaxy wrote:
What about the testing solution that I mentioned?
Does that satisfy your concerns? Or is there still a gap here that needs to
be met?
I think such a testing framework would be helpful, especially if we can
kick it off in CI when preparing pull requests, so that we can make sure
nothing breaks RDMA easily.
Meanwhile, we still need people who know the RDMA code well and are
committed to actively maintaining it.
Thanks,
OK, so comments from Yu Zhang and Gonglei? Can we work up a CI test
along these lines that would ensure that future RDMA breakages are
detected more easily?
What do you think?
- Michael