Re: [PATCH] rdma: not display the rdma link in other net namespace

On Sun, Oct 09, 2022 at 10:20:53AM +0000, yanjun.zhu@xxxxxxxxx wrote:
> September 28, 2022 2:04 PM, "Leon Romanovsky" <leon@xxxxxxxxxx> wrote:
> 
> > On Tue, Sep 27, 2022 at 06:58:50PM +0800, Yanjun Zhu wrote:
> > 
> >> On 2022/9/27 18:34, Leon Romanovsky wrote:
> >> On Sun, Sep 25, 2022 at 10:40:33PM -0400, yanjun.zhu@xxxxxxxxx wrote:
> >>> From: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> >>> 
> >>> When a net device is moved to another net namespace, the command
> >>> "rdma link" should not display the rdma link for this net device.
> >>> 
> >>> For example, when the net device eno12399 is moved to net namespace net0
> >>> from init_net, the rdma link of eno12399 should not display in init_net.
> >>> 
> >>> Before this change:
> >>> 
> >>> Init_net:
> >>> 
> >>> link roceo12399/1 state DOWN physical_state DISABLED <---should not display
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>> 
> >>> net0:
> >>> 
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>> link roceo12409/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED <---should not display
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP <---should not display
> >>> 
> >>> After this change
> >>> 
> >>> Init_net:
> >>> 
> >>> link roceo12409/1 state DOWN physical_state DISABLED netdev eno12409
> >>> link rocep202s0f0/1 state DOWN physical_state DISABLED netdev ens7f0
> >>> link rocep202s0f1/1 state ACTIVE physical_state LINK_UP netdev ens7f1
> >>> 
> >>> net0:
> >>> 
> >>> link roceo12399/1 state DOWN physical_state DISABLED netdev eno12399
> >>> 
> >>> Fixes: da990ab40a92 ("rdma: Add link object")
> >>> Signed-off-by: Zhu Yanjun <yanjun.zhu@xxxxxxxxx>
> >>> ---
> >>> rdma/link.c | 3 +++
> >>> 1 file changed, 3 insertions(+)
> >>> 
> >>> diff --git a/rdma/link.c b/rdma/link.c
> >>> index bf24b849..449a7636 100644
> >>> --- a/rdma/link.c
> >>> +++ b/rdma/link.c
> >>> @@ -238,6 +238,9 @@ static int link_parse_cb(const struct nlmsghdr *nlh, void *data)
> >>>  		return MNL_CB_ERROR;
> >>>  	}
> >>> +	if (!tb[RDMA_NLDEV_ATTR_NDEV_NAME] || !tb[RDMA_NLDEV_ATTR_NDEV_INDEX])
> >>> +		return MNL_CB_OK;
> >>> +
> >> Regarding your question where it should go in addition to RDMA, the answer
> >> is netdev ML. The rdmatool is part of iproute2 and the relevant maintainers
> >> should be CCed.
> >> Thanks. I will also send it to netdev ML and CC the maintainers.
> >> 
> >> Regarding the change, I don't think that it is right. The user space tool is
> >> a simple viewer of data returned from the kernel. It is not a mistake to
> >> return a device without a netdev.
> >> 
> >> Normally an rdma link based on RoCEv2 is tied to a NIC. This NIC device
> >> sends and receives UDP packets. With Mellanox/Intel NIC devices, the net
> >> device also does more work than sending and receiving packets.
> >> 
> >> From this perspective, an rdma link depends on a net device.
> >> 
> >> In this problem, the net device has been moved to another net namespace, so
> >> it cannot be obtained, and this rdma link cannot work in this net namespace
> >> either.
> >> 
> >> So this rdma link should not appear in this net namespace. Otherwise, it
> >> would confuse the user.
> >> 
> >> In fact, net namespace is a concept in the TCP/IP stack, and it does not
> >> exist in the rdma stack.
> > 
> > RDMA has two different net namespace modes: shared and exclusive.
> > 
> > In shared mode, the IB devices are shared across all net namespaces, and
> > "moving" a net device into a different namespace just "hides" it but doesn't
> > disconnect it.
> 
> Hi, Leon
> 
> About the RDMA shared and exclusive modes, I am confused about this scenario:
> 
> In shared mode, IB device A is in net namespace A1 while net device B is in net namespace B1.
> IB device A depends on net device B. How can tests be run in this scenario?
> Both rping and perftest need an IP address to work, but the IP address is in net namespace B1 while
> IB device A is in net namespace A1.
> 
> Does this scenario exist in production environments?

Yes and no at the same time.

Yes:
The whole net namespace support is needed for containers. In old
versions of rdma-core, libibverbs relied on the /sys/class/infiniband/
structure. This is why we need "shared" mode, where an IB device exists
without relation to a netdev.

No:
Like you said, it won't work for RoCE and iWARP.

Thanks

> 
> Thanks and Regards,
> Zhu Yanjun
> 
> > 
> > See comments around various usages of ib_devices_shared_netns variable.
> > 
> > Thanks
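
The two display policies debated in this thread can be put side by side in a minimal sketch. This is not the rdmatool code: the real link_parse_cb() works on netlink attributes through libmnl, while the struct and function names below (link_info, show_link_filtered, show_link_viewer, count_shown) are hypothetical stand-ins chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for what the kernel reports per link.
 * ndev_name is NULL when no netdev is visible in the current namespace. */
struct link_info {
	const char *name;
	const char *ndev_name;
	bool has_ndev_index;
};

typedef bool (*show_policy)(const struct link_info *);

/* Policy of the proposed patch: skip any link that lacks netdev attributes. */
bool show_link_filtered(const struct link_info *l)
{
	return l->ndev_name != NULL && l->has_ndev_index;
}

/* Policy argued for in the review: the tool is a plain viewer and shows
 * every link the kernel returns, with or without a netdev. */
bool show_link_viewer(const struct link_info *l)
{
	(void)l;
	return true;
}

/* Count how many links a given policy would print. */
size_t count_shown(const struct link_info *links, size_t n, show_policy policy)
{
	size_t shown = 0;

	for (size_t i = 0; i < n; i++)
		if (policy(&links[i]))
			shown++;
	return shown;
}
```

Fed the four init_net links from the "Before this change" listing, where only roceo12399/1 lacks a netdev, the filtered policy would show three links and the viewer policy all four. The filtered policy cannot tell a netdev hidden in another namespace apart from a link that legitimately has none, which is the concern raised in the review.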


