Re: some questions about restrictions in SMC-R v2's implementation

On 2024/5/10 17:40, Wenjia Zhang wrote:


On 07.05.24 07:54, Guangguan Wang wrote:
Hi, Wenjia and Jan,

When testing SMC-R v2, I found some scenarios where SMC-R v2 should work, but fallback happened because of some restrictions in the SMC-R v2 implementation.
I want to know why these restrictions exist and what would happen if they were removed.

The first is in the function smc_ib_determine_gid_rcu, which requires a subnet match between smcrv2->saddr and the RDMA-related netdev.
The code is here:
static int smc_ib_determine_gid_rcu(...)
{
     ...
         in_dev_for_each_ifa_rcu(ifa, in_dev) {
             if (!inet_ifa_match(smcrv2->saddr, ifa))
                 continue;
             subnet_match = true;
             break;
         }
         if (!subnet_match)
             goto out;
     ...
out:
     return -ENODEV;
}
In my testing environment, the server as well as the client has two netdevs: eth0 in netnamespace1 and eth0 in netnamespace2. For clarity in the following text, I will refer to eth0 in netnamespace1 as eth1 and to eth0 in netnamespace2 as eth2. eth1's IP is 192.168.0.3/32 and eth2's IP is 192.168.0.4/24. The netmask of eth1 must be 32 for other reasons. eth1 is an RDMA-related netdev, which means the adapter of eth1 has RDMA capability. eth2 has been associated with eth1's RDMA device using smc_pnet. When testing a connection in netnamespace2 (using eth2 for the SMC-R connection), the connection fell back with rsn 0x03010000 because of the subnet matching restriction above. But I think SMC-R should work in this scenario.
In another testing environment of mine, the server as well as the client has two netdevs: eth0 in netnamespace1 and eth1 in netnamespace1. eth0's IP is 192.168.0.3/24 and eth1's IP is 192.168.1.4/24. eth0 is an RDMA-related netdev, which means the adapter of eth0 has RDMA capability. eth1 has been associated with eth0's RDMA device using smc_pnet. When testing an SMC-R connection through eth1, the connection fell back with rsn 0x03010000 because of the subnet matching restriction above. In my environment, eth0 and eth1 have the same network connectivity even though they are in different subnets. I think SMC-R should work in this scenario.
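
To make the two failing cases above concrete, here is a minimal userspace sketch of the subnet test. It assumes that inet_ifa_match() compares the candidate address with the interface address under the interface's own netmask; the helper names ipv4(), prefix_to_mask() and subnet_match() are hypothetical and exist only for this illustration, they are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its dotted-quad components. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Convert a prefix length (0..32) into a host-order netmask. */
static uint32_t prefix_to_mask(unsigned int prefix_len)
{
    assert(prefix_len <= 32);
    /* A shift by 32 would be undefined, so handle /0 separately. */
    return prefix_len == 0 ? 0 : (uint32_t)(0xFFFFFFFFu << (32 - prefix_len));
}

/*
 * Simplified model of the check (hypothetical helper): addr matches if it
 * equals the interface address in every bit covered by the interface's
 * netmask. Note that the mask of the RDMA-related netdev is used, not the
 * mask of the netdev that owns addr.
 */
static bool subnet_match(uint32_t addr, uint32_t ifa_addr,
                         unsigned int ifa_prefix_len)
{
    return ((addr ^ ifa_addr) & prefix_to_mask(ifa_prefix_len)) == 0;
}
```

Under this model, subnet_match(ipv4(192,168,0,4), ipv4(192,168,0,3), 32) is false because a /32 mask only matches the interface address itself, and subnet_match(ipv4(192,168,1,4), ipv4(192,168,0,3), 24) is false because the third octet differs. Both cases therefore take the "goto out" path and return -ENODEV.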

The other is in the function smc_connect_rdma_v2_prepare, which requires the routing configuration to be symmetric between client and server. The code is here:
static int smc_connect_rdma_v2_prepare(...)
{
     ...
     if (fce->v2_direct) {
         memcpy(ini->smcrv2.nexthop_mac, &aclc->r0.lcl.mac, ETH_ALEN);
         ini->smcrv2.uses_gateway = false;
     } else {
         if (smc_ib_find_route(net, smc->clcsock->sk->sk_rcv_saddr,
               smc_ib_gid_to_ipv4(aclc->r0.lcl.gid),
               ini->smcrv2.nexthop_mac,
               &ini->smcrv2.uses_gateway))
             return SMC_CLC_DECL_NOROUTE;
         if (!ini->smcrv2.uses_gateway) {
             /* mismatch: peer claims indirect, but its direct */
             return SMC_CLC_DECL_NOINDIRECT;
         }
     }
     ...
}
In my testing environment, the server's IP is 192.168.0.3/24 and the client's IP is 192.168.0.4/24, regardless of how many netdevs the server or the client has. The server has a special route setting for other reasons, which results in an indirect route from 192.168.0.3/24 to 192.168.0.4/24. Thus, during the CLC handshake, the client gets fce->v2_direct==false, but the client has no special routing setting and finds a direct route from 192.168.0.4/24 to 192.168.0.3/24. Because of the symmetric routing restriction above, the connection fell back with rsn 0x030f0000. But I think SMC-R should work in this scenario.
Also, why is the symmetry of the routing configuration checked only when the server's route is indirect?
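
The branches of the snippet above can be summarized as a small decision function. This is only a sketch of the control flow as quoted, not kernel code: the enum values and check_route_symmetry() are hypothetical names, and the two boolean inputs stand in for the result of smc_ib_find_route().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for this sketch only. */
enum prepare_result {
    PREPARE_OK,
    PREPARE_NO_ROUTE,     /* stands in for SMC_CLC_DECL_NOROUTE */
    PREPARE_NO_INDIRECT,  /* stands in for SMC_CLC_DECL_NOINDIRECT */
};

/*
 * peer_v2_direct:     value of fce->v2_direct sent by the server
 * local_route_found:  whether the client's own route lookup succeeded
 * local_uses_gateway: whether the client's route goes through a gateway
 */
static enum prepare_result check_route_symmetry(bool peer_v2_direct,
                                                bool local_route_found,
                                                bool local_uses_gateway)
{
    /* A gateway flag is only meaningful when a route was found. */
    assert(local_route_found || !local_uses_gateway);

    if (peer_v2_direct)
        return PREPARE_OK;          /* no local route lookup at all */
    if (!local_route_found)
        return PREPARE_NO_ROUTE;
    if (!local_uses_gateway)
        return PREPARE_NO_INDIRECT; /* peer indirect, local direct */
    return PREPARE_OK;
}
```

In the scenario described above, the inputs are (false, true, false), which yields PREPARE_NO_INDIRECT and matches the observed rsn 0x030f0000. The sketch also shows the asymmetry being asked about: when the server reports a direct route, the client's own route is never examined.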

Waiting for your reply.

Thanks,
Guangguan Wang

Hi Guangguan,

Thank you for the questions. We asked ourselves the same questions a while ago and did some research on them. Unfortunately, that research is not finished yet, and I had to delay it because of my vacation last month. Now it's time to pick it up again ;) I'll come back to you as soon as I can give a definite answer.

Thanks,
Wenjia

Hi, Wenjia.

Following Guangguan's questions, I noticed that in SMCv2, ini->smcrv2.saddr stores clcsock->sk->sk_rcv_saddr
and ini->smcrv2.daddr stores the IP converted from the peer RNIC's gid (smc_ib_gid_to_ipv4(smc_v2_ext->roce)),
e.g. in smc_find_rdma_v2_device_serv(). This is also how the src and dst addresses are treated in many
other places, such as in smc_ib_find_route() mentioned above. I am confused about why the usage is 'asymmetrical' in this way.

   * clc src addr <----> clc dst addr
   local RNIC gid <----> * peer RNIC gid          (*) means used for saddr or daddr
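
For readers unfamiliar with the gid-to-IP conversion mentioned above, here is a simplified userspace sketch. It assumes the peer's RoCE v2 gid carries an IPv4 address in IPv4-mapped IPv6 form (::ffff:a.b.c.d); gid_to_ipv4() and GID_SIZE are hypothetical names for this illustration, and the kernel's smc_ib_gid_to_ipv4() may accept additional encodings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GID_SIZE 16  /* a gid has the size of an IPv6 address */

/*
 * Simplified model (hypothetical helper): if gid is an IPv4-mapped IPv6
 * address, copy its last four bytes (network byte order) into out and
 * return 1; otherwise return 0 and leave out untouched.
 */
static int gid_to_ipv4(const uint8_t gid[GID_SIZE], uint8_t out[4])
{
    static const uint8_t v4mapped_prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };

    assert(gid != NULL && out != NULL);
    if (memcmp(gid, v4mapped_prefix, sizeof(v4mapped_prefix)) != 0)
        return 0;
    memcpy(out, gid + 12, 4);
    return 1;
}
```

With this model, a gid of ::ffff:192.168.0.3 yields the bytes 192, 168, 0, 3, so daddr comes from the peer RNIC's address, while saddr comes from the CLC socket's address.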

I guess there might be some reason behind this, and I'd really appreciate it if you have an answer.

Thank you!



