On Sun, Oct 09, 2022 at 10:20:53AM +0000, yanjun.zhu@xxxxxxxxx wrote:
> September 28, 2022 2:04 PM, "Leon Romanovsky" <leon@xxxxxxxxxx> wrote:
>
> > On Tue, Sep 27, 2022 at 06:58:50PM +0800, Yanjun Zhu wrote:
> >
> >> On 2022/9/27 18:34, Leon Romanovsky wrote:
> >> On Sun, Sep 25, 2022 at 10:40:33PM -0400, yanjun.zhu@xxxxxxxxx wrote:
> >>> From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> >>>
> >>> When the net devices are moved to another net namespace, the command
> >>> "rdma link" should not display the rdma link about this net device.
> >>>
> >>> For example, when the net device eno12399 is moved to net namespace net0
> >>> from init_net, the rdma link of eno12399 should not display in init_net.
> >>>
> >>> Before this change:
> >>>
> >>> Init_net:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED <---should not display
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>>
> >>> net0:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>> link roceo12409/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP <---should not display
> >>>
> >>> After this change
> >>>
> >>> Init_net:
> >>>
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>>
> >>> net0:
> >>>
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>>
> >>> Fixes: da990ab40a92 ("rdma: Add link object")
> >>> Signed-off-by: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> >>> ---
> >>> rdma/link.c | 3 +++
> >>> 1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/rdma/link.c b/rdma/link.c
> >>> index bf24b849..449a7636 100644
> >>> --- a/rdma/link.c
> >>> +++ b/rdma/link.c
> >>> @@ -238,6 +238,9 @@ static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
> >>>  		return MNL_CB_ERROR;
> >>>  	}
> >>> +	if (!tb[RDMA_NLDEV_ATTR_NDEV_NAME] || !tb[RDMA_NLDEV_ATTR_NDEV_INDEX])
> >>> +		return MNL_CB_OK;
> >>> +
> >> Regarding your question where it should go in addition to RDMA, the answer
> >> is netdev ML. The rdmatool is part of iproute2 and the relevant maintainers
> >> should be CCed.
> >> Thanks. I will also send it to netdev ML and CC the maintainers.
> >>
> >> Regarding the change, I don't think that it is right. The user space tool
> >> is a simple viewer of data returned from the kernel. It is not a mistake
> >> to return a device without a netdev.
> >>
> >> Normally a rdma link based on RoCEv2 should be with a NIC. This NIC device
> >> will send/recv udp packets. With a mellanox/intel NIC device, this net
> >> device also does more work than sending/receiving packets.
> >>
> >> From this perspective, a rdma link is dependent on a net device.
> >>
> >> In this problem, the net device is moved to another net namespace. So it
> >> cannot be obtained. And this rdma link also cannot work in this net
> >> namespace.
> >>
> >> So this rdma link should not appear in this net namespace. Or else, it
> >> would confuse the user.
> >>
> >> In fact, net namespace is a concept in the tcp/ip stack. And it does not
> >> exist in the rdma stack.
> >
> > RDMA has two different net namespace modes: shared and exclusive.
> >
> > In shared mode, the IB devices are shared across all net namespaces and
> > "moving" a net device into a different namespace just "hides" it, but
> > doesn't disconnect it.
>
> Hi, Leon
>
> About RDMA shared and exclusive mode, I am confused about this scenario:
>
> In shared mode, ib device A is in net namespace A1 while netdev device B is in net namespace B1.
> IB device A is dependent on netdev device B. How can the above scenario be tested?
> Both rping and perftest need an IP address to work. But now the IP address is in net
> namespace B1 while ib device A is in net namespace A1.
>
> In a production environment, does the above scenario exist?

Yes and no at the same time.

Yes: The whole net namespace support is needed for containers. In old
versions of rdma-core, libibverbs relied on the /sys/class/infiniband/
structure. This is why we need "shared" mode, where IB exists without
relation to netdev.

No: Like you said, it won't work for RoCE and iWARP.

Thanks

>
> Thanks and Regards,
> Zhu Yanjun
>
> >
> > See comments around various usages of ib_devices_shared_netns variable.
> >
> > Thanks