Re: [PATCH for-next v7 10/12] RDMA/erdma: Add the erdma module

On 5/19/2022 12:20 PM, Bernard Metzler wrote:


-----Original Message-----
From: Jason Gunthorpe <jgg@xxxxxxxxxx>
Sent: Wednesday, 18 May 2022 18:32
To: Cheng Xu <chengyou@xxxxxxxxxxxxxxxxx>; Bernard Metzler
<BMT@xxxxxxxxxxxxxx>; Tom Talpey <tom@xxxxxxxxxx>
Cc: dledford@xxxxxxxxxx; leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
KaiShen@xxxxxxxxxxxxxxxxx; tonylu@xxxxxxxxxxxxxxxxx
Subject: [EXTERNAL] Re: [PATCH for-next v7 10/12] RDMA/erdma: Add the erdma
module

On Thu, May 19, 2022 at 12:24:22AM +0800, Cheng Xu wrote:


On 5/18/22 10:46 PM, Jason Gunthorpe wrote:
On Wed, May 18, 2022 at 04:30:33PM +0800, Cheng Xu wrote:


On 5/10/22 9:17 PM, Jason Gunthorpe wrote:
On Thu, Apr 21, 2022 at 03:17:45PM +0800, Cheng Xu wrote:

+static struct rdma_link_ops erdma_link_ops = {
+	.type = "erdma",
+	.newlink = erdma_newlink,
+};

Why is there still a newlink?


Hello, Jason,

About this issue, I have another idea that is simpler and more reasonable.

Maybe the erdma driver doesn't need to link to a net device in the
kernel. In the core code, ib_device_get_netdev has several use cases:

    1). query port info in netlink
    2). get eth speed for IB (ib_get_eth_speed)
    3). enumerate all RoCE ports (ib_enum_roce_netdev)
    4). iw_query_port

The only case related to erdma is 4), but we change it in our patch
02/12. So it seems all right not to link erdma to a net device.

* I also tested this solution; it works for both perftest and NoF. *

Another issue is how to get the port state and attributes without a
net device. For this, erdma can get them from HW directly.

So, I think this may be the final solution. (BTW, I have gone over
the rdma drivers: EFA does it this way, and it also has two separate
devices for net and rdma. It inspired me.)

I'm not sure this works for an iWarp device - various things expect to
know the netdevice to know how to relate IP addresses to the iWarp
stuff - but then I don't really know iWarp.

As far as I know, an iWarp device has only one GID entry, which is
generated from the MAC address.

For iWarp, the CM part of the core code resolves the address, finds the
route with the help of the kernel's net subsystem, and then obtains the
correct ibdev by GID matching. GID matching in iWarp is in effect MAC
address matching.
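
As a rough illustration of the MAC-based matching described above, here
is a userspace sketch, not the core code. The helper names mac_to_gid
and gid_matches_mac are hypothetical, and the GID layout (MAC in the
first six bytes, remaining bytes zero) is an assumption about how an
iWarp GID is derived from the MAC address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6   /* length of an Ethernet MAC address in bytes */
#define GID_LEN  16  /* length of an RDMA GID in bytes */

/*
 * Hypothetical helper: build an iWarp-style GID from a MAC address.
 * Assumed layout: MAC in bytes 0..5, bytes 6..15 zeroed.
 */
static void mac_to_gid(const uint8_t mac[ETH_ALEN], uint8_t gid[GID_LEN])
{
	assert(mac != NULL && gid != NULL);
	memset(gid, 0, GID_LEN);
	memcpy(gid, mac, ETH_ALEN);
}

/*
 * Hypothetical helper: return 1 if a device's GID corresponds to the
 * MAC of the interface picked by route resolution, 0 otherwise.
 */
static int gid_matches_mac(const uint8_t gid[GID_LEN],
			   const uint8_t mac[ETH_ALEN])
{
	uint8_t expected[GID_LEN];

	mac_to_gid(mac, expected);
	return memcmp(gid, expected, GID_LEN) == 0;
}
```

With this layout, comparing two GIDs is equivalent to comparing the two
MAC addresses they were built from, which is the point made above.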

In other words, for iWarp devices the core code doesn't handle IP
addressing directly; that is done by calling net APIs. The netdev set
by ib_device_set_netdev is not used in iWarp's CM process.

The bound netdev in iWarp devices mainly serves two purposes:
   1). generating GID0, using the netdev's MAC address.
   2). getting the port state and attributes.

For 1), the erdma device is also bound to the net device by MAC address,
which can be obtained from our PCIe BAR registers.
For 2), erdma can also get the information itself, possibly more
accurately. For example, erdma can have a different MTU from virtio-net
in our cloud.

For RoCEv2, I know that it has many GIDs, some of which are generated
from IP addresses, and that IP addressing is handled in the core code.

Bernard, Tom what do you think?

Jason

I think iWarp (and now RoCEv2 with its UDP dependency) drivers
produce GIDs mostly to satisfy the current RDMA CM infrastructure,
which depends on this type of unique identifier, inherited from IB.
Imo, more natural would be to implement IP based RDMA protocols
connection management by relying on IP addresses.

Agreed. Exposing MAC addresses for this seems so... 20th century.

Tom.

Sorry for asking again - why does erdma not need to link with a netdev?
Can erdma exist without using a netdev?


Thanks,
Bernard


