On 2025/3/13 15:46, Wenjia Zhang wrote:
>
>
> On 04.03.25 13:43, Guangguan Wang wrote:
>> When using smc_pnet in SMC, it will only search the pnetid in the
>> base_ndev of the netdev hierarchy (both HW PNETID and user-defined
>> sw pnetid). This may not work for some scenarios when using SMC in
>> containers in a cloud environment.
>> In a container there are several choices of container network,
>> such as directly using the host network, virtual network IPVLAN, veth,
>> etc. Different choices of container network have different netdev
>> hierarchies. Examples of netdev hierarchies are shown below. (eth0 and
>> eth1 in the host below are the netdevs directly related to the
>> physical device.)
>>  _______________________________
>> |   _________________           |
>> |  |POD              |          |
>> |  |                 |          |
>> |  | eth0_________   |          |
>> |  |____|         |__|          |
>> |       |         |             |
>> |       |         |             |
>> |   eth1|base_ndev| eth0_______ |
>> |       |         |    | RDMA  ||
>> | host  |_________|    |_______||
>> ---------------------------------
>>      netdev hierarchy if directly using host network
>>  ________________________________
>> |   _________________           |
>> |  |POD  __________  |          |
>> |  |    |upper_ndev| |          |
>> |  |eth0|__________| |          |
>> |  |_______|_________|          |
>> |          |lower netdev        |
>> |        __|______              |
>> |   eth1|         | eth0_______ |
>> |       |base_ndev|    | RDMA  ||
>> | host  |_________|    |_______||
>> ---------------------------------
>>      netdev hierarchy if using IPVLAN
>>  _______________________________
>> |   _____________________       |
>> |  |POD        _________ |      |
>> |  |          |base_ndev||      |
>> |  |eth0(veth)|_________||      |
>> |  |____________|________|      |
>> |               |pairs          |
>> |        _______|_              |
>> |       |         | eth0_______ |
>> |   veth|base_ndev|    | RDMA  ||
>> |       |_________|    |_______||
>> |        _________              |
>> |   eth1|base_ndev|             |
>> | host  |_________|             |
>> ---------------------------------
>>      netdev hierarchy if using veth
>> For some reason, eth1 in the host is not an RDMA-attached netdevice,
>> so a pnetid is needed to map eth1 (in the host) to the RDMA device so
>> that the POD can do SMC-R.
>> Because eth1 (in the host) is managed by a CNI plugin (such as
>> Terway, a network management plugin for container environments), and
>> in a cloud environment the eth (in the host) can be dynamically
>> inserted by CNI when a POD is created and dynamically removed by CNI
>> when the POD is destroyed and no POD is related to the eth (in the
>> host) anymore, it is hard to configure the pnetid on eth1 (in the
>> host). But it is easy to configure the pnetid on the netdevice that
>> is visible in the POD. When doing SMC-R, both the container directly
>> using the host network and the container using the veth network can
>> successfully match the RDMA device, because the netdev configured
>> with the pnetid is a base_ndev. But the container using IPVLAN cannot
>> match the RDMA device and a 0x03030000 fallback happens, because the
>> netdev configured with the pnetid is not a base_ndev. Additionally,
>> configuring the pnetid on eth1 (in the host) also does not work for
>> matching the RDMA device when using the veth network and doing SMC-R
>> in the POD.
>>
>> To resolve the problems listed above, this patch extends the search
>> to the user-defined sw pnetid of the clc handshake ndev when no pnetid
>> can be found in the base_ndev; the base_ndev takes precedence over the
>> ndev for backward compatibility. This patch also unifies the pnetid
>> setup across the different network choices listed above in containers
>> (configure the user-defined sw pnetid on the netdevice that can be
>> seen in the POD).
>>
>> Signed-off-by: Guangguan Wang <guangguan.wang@xxxxxxxxxxxxxxxxx>
>> ---
>>  net/smc/smc_pnet.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>> index 716808f374a8..b391c2ef463f 100644
>> --- a/net/smc/smc_pnet.c
>> +++ b/net/smc/smc_pnet.c
>> @@ -1079,14 +1079,16 @@ static void smc_pnet_find_roce_by_pnetid(struct net_device *ndev,
>>                                           struct smc_init_info *ini)
>>  {
>>      u8 ndev_pnetid[SMC_MAX_PNETID_LEN];
>> +    struct net_device *base_ndev;
>>      struct net *net;
>>
>> -    ndev = pnet_find_base_ndev(ndev);
>> +    base_ndev = pnet_find_base_ndev(ndev);
>>      net = dev_net(ndev);
>> -    if (smc_pnetid_by_dev_port(ndev->dev.parent, ndev->dev_port,
>> +    if (smc_pnetid_by_dev_port(base_ndev->dev.parent, base_ndev->dev_port,
>>                                 ndev_pnetid) &&
>> +        smc_pnet_find_ndev_pnetid_by_table(base_ndev, ndev_pnetid) &&
>>          smc_pnet_find_ndev_pnetid_by_table(ndev, ndev_pnetid)) {
>> -        smc_pnet_find_rdma_dev(ndev, ini);
>> +        smc_pnet_find_rdma_dev(base_ndev, ini);
>>          return; /* pnetid could not be determined */
>>      }
>>      _smc_pnet_find_roce_by_pnetid(ndev_pnetid, ini, NULL, net);
>
> Hi Guangguan,
>
> sorry for the late answer! It looks good to me. Here is my R-b:
>
> Reviewed-by: Wenjia Zhang <wenjia@xxxxxxxxxxxxx>
>

Thanks, Wenjia.

> Btw. could you give Halil some time for the review? He also wants to have a look.

It is OK.

Regards,
Guangguan Wang

>
> Thanks,
> Wenjia
>
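
For readers following the thread, the lookup order that the quoted patch describes can be modelled in plain userspace C. This is only a rough sketch, not the kernel code: struct model_ndev, model_find_base_ndev(), model_copy_pnetid(), model_resolve_pnetid() and MODEL_MAX_PNETID_LEN are all hypothetical names invented for illustration, and the walk down the "lower" pointer is a simplified assumption about how a base_ndev is reached. The sketch keeps the convention visible in the diff that a nonzero return means "no pnetid found".

```c
#include <stddef.h>
#include <stdio.h>

#define MODEL_MAX_PNETID_LEN 16 /* hypothetical buffer size for this sketch */

/* Hypothetical stand-in for struct net_device, reduced to what the sketch needs. */
struct model_ndev {
    const char *name;
    const char *hw_pnetid;    /* pnetid reported by the hardware, or NULL */
    const char *table_pnetid; /* user-defined sw pnetid table entry, or NULL */
    struct model_ndev *lower; /* device below this one in the hierarchy, or NULL */
};

/* Which lookup step produced the pnetid. */
enum model_pnetid_src {
    MODEL_SRC_NONE,       /* nothing found; caller falls back to base_ndev */
    MODEL_SRC_BASE_HW,    /* HW pnetid of the base_ndev */
    MODEL_SRC_BASE_TABLE, /* sw pnetid table entry of the base_ndev */
    MODEL_SRC_NDEV_TABLE, /* sw pnetid table entry of the handshake ndev */
};

/* Assumption: the base device is reached by following "lower" to the bottom. */
static struct model_ndev *model_find_base_ndev(struct model_ndev *ndev)
{
    while (ndev->lower)
        ndev = ndev->lower;
    return ndev;
}

/* Returns 0 and fills out on success, nonzero if there is no pnetid. */
static int model_copy_pnetid(const char *src, char *out)
{
    if (!src)
        return -1;
    snprintf(out, MODEL_MAX_PNETID_LEN + 1, "%s", src);
    return 0;
}

/*
 * Lookup order after the patch: the base_ndev is tried first (HW pnetid,
 * then table entry) and the handshake ndev's table entry is tried last.
 * out must hold at least MODEL_MAX_PNETID_LEN + 1 bytes.
 */
enum model_pnetid_src model_resolve_pnetid(struct model_ndev *ndev, char *out)
{
    struct model_ndev *base_ndev = model_find_base_ndev(ndev);

    if (!model_copy_pnetid(base_ndev->hw_pnetid, out))
        return MODEL_SRC_BASE_HW;
    if (!model_copy_pnetid(base_ndev->table_pnetid, out))
        return MODEL_SRC_BASE_TABLE;
    if (!model_copy_pnetid(ndev->table_pnetid, out))
        return MODEL_SRC_NDEV_TABLE;
    return MODEL_SRC_NONE;
}
```

In this model, the IPVLAN case from the patch description (pnetid configured only on the POD's eth0, whose lower device is the host's eth1) resolves through the last step, which is the case that previously ended in a fallback. A pnetid present on the base_ndev still wins, matching the backward-compatibility note in the commit message.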