On 5/18/22 10:46 PM, Jason Gunthorpe wrote:
On Wed, May 18, 2022 at 04:30:33PM +0800, Cheng Xu wrote:
On 5/10/22 9:17 PM, Jason Gunthorpe wrote:
On Thu, Apr 21, 2022 at 03:17:45PM +0800, Cheng Xu wrote:
+static struct rdma_link_ops erdma_link_ops = {
+ .type = "erdma",
+ .newlink = erdma_newlink,
+};
Why is there still a newlink?
Hello, Jason,
About this issue, I have another idea that is simpler and more reasonable.
Maybe the erdma driver doesn't need to link to a net device in the kernel.
In the core code, ib_device_get_netdev has several use cases:
1). query port info in netlink
2). get eth speed for IB (ib_get_eth_speed)
3). enumerate all RoCE ports (ib_enum_roce_netdev)
4). iw_query_port
The only case related to erdma is 4), but we change it in our patch 02/12.
So, it seems all right that we do not link erdma to a net device.
* I also tested this solution; it works for both perftest and NoF. *
Another issue is how to get the port state and attributes without
a net device. For this, erdma can get them from HW directly.
So, I think this may be the final solution. (BTW, I have gone over
the rdma drivers: EFA does it this way, and it also has two separate
devices for net and rdma. It inspired me.)
I'm not sure this works for an iWarp device - various things expect to
know the netdevice to know how to relate IP addresses to the iWarp
stuff - but then I don't really know iWarp.
As far as I know, an iWarp device has only one GID entry, which is
generated from the MAC address.
For iWarp, the CM part of the core code resolves the address, finds the
route with the help of the kernel's net subsystem, and then obtains the
correct ibdev by GID matching. GID matching in iWarp is in fact MAC
address matching.
In other words, for iWarp devices, the core code doesn't handle IP
addressing directly; that is done by calling net APIs.
The netdev set by ib_device_set_netdev is not used in iWarp's CM
process.
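To make the "GID matching is MAC matching" point concrete, here is a
minimal userspace sketch. It assumes the single iWarp GID carries the
MAC address in its first six bytes with the remaining bytes zeroed. All
names (sketch_gid, sketch_mac_to_gid, sketch_gid_matches_mac) are
hypothetical and are not the kernel's or erdma's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6   /* bytes in an Ethernet MAC address */
#define GID_LEN  16  /* bytes in an RDMA GID */

/* Userspace stand-in for the kernel's union ib_gid (hypothetical name). */
struct sketch_gid {
	uint8_t raw[GID_LEN];
};

/*
 * Build the single iWarp GID under the assumed layout:
 * MAC address in bytes 0..5, all remaining bytes zero.
 */
static void sketch_mac_to_gid(const uint8_t mac[ETH_ALEN],
			      struct sketch_gid *gid)
{
	assert(mac && gid);
	memset(gid, 0, sizeof(*gid));
	memcpy(gid->raw, mac, ETH_ALEN);
}

/* With that layout, comparing GIDs reduces to comparing MAC addresses. */
static bool sketch_gid_matches_mac(const struct sketch_gid *gid,
				   const uint8_t mac[ETH_ALEN])
{
	struct sketch_gid expect;

	assert(gid && mac);
	sketch_mac_to_gid(mac, &expect);
	return memcmp(gid->raw, expect.raw, GID_LEN) == 0;
}
```

Nothing in this sketch needs a netdev pointer: the MAC can come from any
source, such as a value the driver reads from its own device.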
The netdev bound to an iWarp device mainly serves two purposes:
1). generating GID0 from the netdev's MAC address.
2). getting the port state and attributes.
For 1), the erdma device is also bound to its net device by MAC address,
which can be obtained from our PCIe BAR registers.
For 2), erdma can also get the information, and possibly more accurately.
For example, erdma can have a different MTU from virtio-net in our cloud.
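As a rough illustration of 2), the sketch below fills port attributes
from values reported by the device instead of from a netdev. The enum
mirrors the numbering of the kernel's enum ib_mtu and the rounding
follows ib_mtu_int_to_enum; the structs, field names and the
sketch_query_port helper are hypothetical and do not describe erdma's
real registers or code.

```c
#include <stdbool.h>

/* Same numbering as the kernel's enum ib_mtu; prefixed to mark a sketch. */
enum sketch_ib_mtu {
	SKETCH_MTU_256  = 1,
	SKETCH_MTU_512  = 2,
	SKETCH_MTU_1024 = 3,
	SKETCH_MTU_2048 = 4,
	SKETCH_MTU_4096 = 5,
};

/* Round a byte MTU down to the nearest IB MTU step (minimum is 256). */
static enum sketch_ib_mtu sketch_mtu_int_to_enum(int mtu)
{
	if (mtu >= 4096)
		return SKETCH_MTU_4096;
	if (mtu >= 2048)
		return SKETCH_MTU_2048;
	if (mtu >= 1024)
		return SKETCH_MTU_1024;
	if (mtu >= 512)
		return SKETCH_MTU_512;
	return SKETCH_MTU_256;
}

/* Hypothetical: what the driver would read from its own hardware. */
struct sketch_hw_port {
	bool link_up;
	int mtu_bytes;
};

/* Hypothetical subset of the port attributes reported to the core. */
struct sketch_port_attr {
	bool link_up;
	enum sketch_ib_mtu active_mtu;
};

/* Fill port attributes from hardware state only; no netdev is consulted. */
static void sketch_query_port(const struct sketch_hw_port *hw,
			      struct sketch_port_attr *attr)
{
	attr->link_up = hw->link_up;
	attr->active_mtu = sketch_mtu_int_to_enum(hw->mtu_bytes);
}
```

With this shape, an RDMA-side MTU that differs from the paired net
device's MTU is reported as the hardware sees it. The specific MTU
values used to exercise the sketch are arbitrary.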
For RoCEv2, I know that it has many GIDs, some of which are generated
from IP addresses, and that IP addressing is handled in the core code.
If it works at all it is not a great idea
As I explained above (I hope I explained clearly, but I'm a little worried
about my English :) ), the netdev bound by ib_device_set_netdev in an iWarp
device has limited use. Compared with the v8 approach (traversing the
netdevs in the kernel to find the matching net device), I think not
setting a netdev is better.
After this explanation, what do you think? If it is still not OK,
I will continue with the v8 approach. Otherwise, I will send a v9 patch.
Thanks,
Cheng Xu