On Thu, 2023-10-12 at 20:17 +0800, Dust Li wrote:
> On Wed, Oct 11, 2023 at 10:48:16PM +0800, Dust Li wrote:
> > On Thu, Sep 28, 2023 at 05:04:21PM +0200, Niklas Schnelle wrote:
> > > On Mon, 2023-09-25 at 10:35 +0800, Albert Huang wrote:
> > > > If the netdevice is within a container and communicates externally
> > > > through network technologies like VXLAN, we won't be able to find
> > > > routing information in the init_net namespace. To address this issue,
> > > > we need to add a struct net parameter to the smc_ib_find_route
> > > > function. This allows us to locate the routing information within the
> > > > corresponding net namespace, ensuring the correct completion of the
> > > > SMC CLC interaction.
> > > >
> > > > Signed-off-by: Albert Huang <huangjie.albert@xxxxxxxxxxxxx>
> > > > ---
> > > >  net/smc/af_smc.c | 3 ++-
> > > >  net/smc/smc_ib.c | 7 ++++---
> > > >  net/smc/smc_ib.h | 2 +-
> > > >  3 files changed, 7 insertions(+), 5 deletions(-)
> > > >
> > >
> > > I'm trying to test this patch on s390x but I'm running into the same
> > > issue I ran into with the original SMC namespace support:
> > > https://lore.kernel.org/netdev/8701fa4557026983a9ec687cfdd7ac5b3b85fd39.camel@xxxxxxxxxxxxx/
> > >
> > > Just like back then I'm using a server and a client network namespace
> > > on the same system with two ConnectX-4 VFs from the same card and
> > > port. Both TCP/IP traffic and user-space RDMA via "qperf … rc_bw" and
> > > "qperf … rc_lat" work between the namespaces and definitely go via
> > > the card.
> > >
> > > I used "rdma system set netns exclusive" and then moved the RDMA
> > > devices into the namespaces with "rdma dev set <rdma_dev> netns
> > > <namespace>". I also verified with "ip netns exec <namespace> rdma dev"
> > > that the RDMA devices are in the network namespaces, and as the qperf
> > > runs show, plain RDMA does work.
> > >
> > > For reference, the smc_chk tool gives me the following output:
> > >
> > >   Server started on port 37373
> > >   [DEBUG] Interfaces to check: eno4378
> > >   Test with target IP 10.10.93.12 and port 37373
> > >     Live test (SMC-D and SMC-R)
> > >   [DEBUG] Running client: smc_run /tmp/echo-clt.x0q8iO 10.10.93.12 -p 37373
> > >   [DEBUG] Client result: TCP 0x05000000/0x03030000
> > >     Failed (TCP fallback), reasons:
> > >       Client: 0x05000000 Peer declined during handshake
> > >       Server: 0x03030000 No SMC devices found (R and D)
> > >
> > > I also checked that SMC is generally working: once I add an ISM
> > > device I do get SMC-D between the namespaces. Any ideas what could
> > > break SMC-R here?
> >
> > I missed the email :(
> >
> > Are you running SMC-Rv2 or v1?
>
> Hi Niklas,
>
> I tried your test today and ran into the same issue. In my case it was
> because my two VFs were in different subnets: SMC-Rv2 works fine, but
> SMC-Rv1 won't, which is expected. When I put the two VFs in the same
> subnet, SMC-Rv1 also works.
>
> So I'm not sure it's the same for you. Can you check it out?
>
> BTW, the fallback reason (SMC_CLC_DECL_NOSMCDEV) in this case is really
> not friendly; it would be better to return SMC_CLC_DECL_DIFFPREFIX.
>
> Best regards,
> Dust

I think you are right. I did use two consecutive private IPs but had set
the subnet mask to /32. After setting it to /16, the SMC-R connection is
established. I'll work with Wenjia and Jan on figuring out why my system
defaults to SMC-Rv1; I would have hoped to get SMC-Rv2.

Thanks for your insights!
Niklas
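
For readers following the thread: the change Albert describes boils down
to threading the caller's network namespace into the route lookup that
smc_ib_find_route() performs, instead of hard-coding init_net. Below is a
minimal sketch of that idea, simplified from net/smc/smc_ib.c; treat it
as an illustration of the approach, not the exact patch (the real error
handling and caller plumbing may differ).

#include <linux/if_ether.h>
#include <net/neighbour.h>
#include <net/route.h>

/* Sketch: resolve the next hop for a CLC connection in the namespace
 * the caller hands us, so routes that only exist inside a container's
 * namespace (e.g. via VXLAN) are found.
 */
static int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
			     u8 nexthop_mac[], u8 *uses_gateway)
{
	struct neighbour *neigh = NULL;
	struct rtable *rt = NULL;
	struct flowi4 fl4 = {
		.saddr = saddr,
		.daddr = daddr
	};

	if (daddr == cpu_to_be32(INADDR_NONE))
		goto out;
	/* look up the route in the caller's namespace; was &init_net */
	rt = ip_route_output_flow(net, &fl4, NULL);
	if (IS_ERR(rt))
		goto out;
	if (rt->rt_uses_gateway && rt->rt_gw_family != AF_INET)
		goto out_rt;
	/* resolve the next hop's MAC for the RoCE connection */
	neigh = dst_neigh_lookup(&rt->dst, &fl4.daddr);
	if (!neigh)
		goto out_rt;
	memcpy(nexthop_mac, neigh->ha, ETH_ALEN);
	*uses_gateway = rt->rt_uses_gateway;
	neigh_release(neigh);
	ip_rt_put(rt);
	return 0;

out_rt:
	ip_rt_put(rt);
out:
	return -ENOENT;
}

On the caller side (the af_smc.c hunk in the diffstat), the namespace
would plausibly come from the SMC socket itself, e.g. something like
sock_net(&smc->sk); that detail is an assumption here, not quoted from
the patch.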