On 5/20/22 12:20 AM, Bernard Metzler wrote:
<...>
As far as I know, an iWarp device has only one GID entry, which is
generated from the MAC address.
For iWarp, the CM part of the core code resolves the address, finds the
route with the help of the kernel's net subsystem, and then obtains the
correct ibdev by GID matching. GID matching in iWarp is in fact MAC
address matching.
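To illustrate why GID matching reduces to MAC matching here, a minimal
userspace sketch follows. It is not kernel code. It assumes the GID
carries the 6-byte MAC in its leading bytes with the remainder zeroed;
the helper names mac_to_gid and gid_matches_mac are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6   /* length of an Ethernet MAC address */
#define GID_LEN 16  /* length of an RDMA GID */

/*
 * Hypothetical helper: build an iWarp-style GID from a MAC address.
 * Assumption: the MAC occupies the first 6 bytes, the rest is zero.
 */
void mac_to_gid(const uint8_t mac[MAC_LEN], uint8_t gid[GID_LEN])
{
	assert(mac != NULL && gid != NULL);
	memset(gid, 0, GID_LEN);
	memcpy(gid, mac, MAC_LEN);
}

/* Under that layout, comparing GIDs is the same as comparing MACs. */
bool gid_matches_mac(const uint8_t gid[GID_LEN], const uint8_t mac[MAC_LEN])
{
	uint8_t expected[GID_LEN];

	mac_to_gid(mac, expected);
	return memcmp(gid, expected, GID_LEN) == 0;
}
```

With this layout there is exactly one GID per device, so finding the
ibdev for a resolved route only needs the MAC of the outgoing interface.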
In other words, for iWarp devices the core code doesn't handle IP
addressing directly; that is done by calling net APIs.
The netdev set by ib_device_set_netdev is not used in iWarp's CM
process.
The bound netdev in iWarp devices mainly serves two purposes:
1). generating GID0 from the netdev's MAC address.
2). getting the port state and attributes.
For 1), the erdma device is also bound to its net device by MAC address,
which can be obtained from our PCIe BAR registers.
For 2), erdma can also get this information itself, possibly more
accurately. For example, erdma can have a different MTU from virtio-net
in our cloud.
For RoCEv2, I know that it has many GIDs, some of them generated from
IP addresses, and that IP addressing is handled in core code.
Bernard, Tom, what do you think?
Jason
I think iWarp (and now RoCEv2 with its UDP dependency) drivers
produce GIDs mostly to satisfy the current RDMA CM infrastructure,
which depends on this type of unique identifier, inherited from IB.
Imo, it would be more natural to implement connection management for
IP-based RDMA protocols by relying on IP addresses.
Sorry for asking again - why does erdma not need to link with a netdev?
Can erdma exist without using a netdev?
Actually, erdma also needs a net device to bind to, and it does have one.
These days I’m trying to find an acceptable way to get a reference to
the bound netdev, e.g. the 'struct net_device' pointer. Other RDMA
drivers can get a reference to their bound netdevs easily (most RDMA
devices are based on the extended aux devices), but it is a little more
complex for erdma, because erdma and its bound net device are two
separate PCIe devices.
Then I found that the netdev reference held in ibdev is rarely used
in core code for iWarp devices; GID0 is the key attribute (as you and
Tom mentioned, it exists because of the historical need for
compatibility, but I think this is another story).
So, there are two choices for erdma: enumerate the net devices and find
the matching one, or never call ib_device_set_netdev. The second one
needs less code.
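The first choice could look roughly like the sketch below. It is a
userspace model only, not the erdma implementation: struct fake_netdev
and find_netdev_by_mac are hypothetical stand-ins, and the array stands
in for the list of net devices. In the kernel, the walk would have to
run under the appropriate locking and take a reference on the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6  /* length of an Ethernet MAC address */

/* Hypothetical stand-in for a net device: just a name and a MAC. */
struct fake_netdev {
	const char *name;
	uint8_t mac[MAC_LEN];
};

/*
 * Walk the device table and return the first entry whose MAC equals
 * the one reported by the RDMA device (for erdma, the MAC read from
 * the PCIe BAR registers). Returns NULL when nothing matches.
 */
const struct fake_netdev *find_netdev_by_mac(const struct fake_netdev *devs,
					     size_t count,
					     const uint8_t mac[MAC_LEN])
{
	assert(devs != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (memcmp(devs[i].mac, mac, MAC_LEN) == 0)
			return &devs[i];
	}
	return NULL;
}
```

A NULL result corresponds to the case where the net device has not been
registered yet, which the driver would have to handle separately.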
The second way can't work for RoCE. But it works for iWarp (I've
tested), since the netdev reference is rarely used for iWarp in core
code, as I said in my last reply.
In short, the question discussed here is: is it acceptable for an iWarp
driver not to hold the netdev reference in core code (even though it
does have a netdev bound to it)? Or is it necessary to call
ib_device_set_netdev to set the bound netdev for an iWarp driver?
You and Tom are both specialists in iWarp, so your opinions are important.
Thanks very much
Cheng Xu
Thanks,
Bernard.