Re: [PATCH] ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/19/2023 6:37 PM, Namjae Jeon wrote:
2023-10-19 15:01 GMT+09:00, Kangjing (Chaser) Huang <huangkangjing@xxxxxxxxx>:
Thank you for the review. Also thanks a lot to Namjae for submitting
this patch on my behalf.


On Tue, Oct 17, 2023 at 10:07 AM Tom Talpey <tom@xxxxxxxxxx> wrote:

On 10/15/2023 10:45 AM, Namjae Jeon wrote:
From: Kangjing Huang <huangkangjing@xxxxxxxxx>

Physical ib_device does not have an underlying net_device, thus its
association with IPoIB net_device cannot be retrieved via
ops.get_netdev() or ib_device_get_by_netdev(). ksmbd reads physical
ib_device port GUID from the lower 16 bytes of the hardware addresses
on
IPoIB net_device and match its underlying ib_device using ib_find_gid()

Signed-off-by: Kangjing Huang <huangkangjing@xxxxxxxxx>
Acked-by: Namjae Jeon <linkinjeon@xxxxxxxxxx>
---
   fs/smb/server/transport_rdma.c | 39
+++++++++++++++++++++++++---------
   1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/smb/server/transport_rdma.c
b/fs/smb/server/transport_rdma.c
index 3b269e1f523a..a82131f7dd83 100644
--- a/fs/smb/server/transport_rdma.c
+++ b/fs/smb/server/transport_rdma.c
@@ -2140,8 +2140,7 @@ static int smb_direct_ib_client_add(struct
ib_device *ib_dev)
       if (ib_dev->node_type != RDMA_NODE_IB_CA)
               smb_direct_port = SMB_DIRECT_PORT_IWARP;

-     if (!ib_dev->ops.get_netdev ||
-         !rdma_frwr_is_supported(&ib_dev->attrs))
+     if (!rdma_frwr_is_supported(&ib_dev->attrs))
               return 0;

       smb_dev = kzalloc(sizeof(*smb_dev), GFP_KERNEL);
@@ -2241,17 +2240,37 @@ bool ksmbd_rdma_capable_netdev(struct net_device
*netdev)
               for (i = 0; i < smb_dev->ib_dev->phys_port_cnt; i++) {
                       struct net_device *ndev;

-                     ndev =
smb_dev->ib_dev->ops.get_netdev(smb_dev->ib_dev,
-                                                            i + 1);
-                     if (!ndev)
-                             continue;
+                     /* RoCE and iWRAP ib_dev is backed by a netdev */
+                     if (smb_dev->ib_dev->ops.get_netdev) {

The "IWRAP" is a typo, but IMO the comment is misleading. This is simply
looking up the target netdev, it's not limited to these two rdma types.
I suggest deleting the comment.


I agree with this point and will update this comment in version 2 of the
patch.

+                             ndev = smb_dev->ib_dev->ops.get_netdev(
+                                     smb_dev->ib_dev, i + 1);
+                             if (!ndev)
+                                     continue;

-                     if (ndev == netdev) {
+                             if (ndev == netdev) {
+                                     dev_put(ndev);
+                                     rdma_capable = true;
+                                     goto out;
+                             }
                               dev_put(ndev);

Why not move this dev_put up above the if (ndev == netdev) test? It's
needed in both cases, so it's confusing to have two copies.

I do agree that these two puts are very confusing. However, this code
structure comes from the original ksmbd_rdma_capable_netdev() and
this patch only indents it one more level and puts it in the if test for
get_netdev.

Also, while two puts look very weird, IMO putting it before the if
statement
is equally unintuitive as well since that would be testing the pointer after
its
reference is released. Although since no de-reference is happening here, it
might be fine. Please confirm this is good coding-style-wise and I will
include
this change in v2 of the patch.
There is no need to update code that does not have problems.

Well, there are now at least half a dozen stanzas of
	dev_put
	rdma_capable = <T|f>
	goto out

and I don't see why these can't be simplified to

	goto out_capable|out_notcapable

It woud be a lot clearer, at least to this reviewer. And more reliable
to maintain.



-                             rdma_capable = true;
-                             goto out;
+                     /* match physical ib_dev with IPoIB netdev by GUID
*/

Add more information to this comment, perhaps:

    /* if no exact netdev match, check for matching Infiniband GUID */


Fair point, will update this comment in v2.

+                     } else if (netdev->type == ARPHRD_INFINIBAND) {
+                             struct netdev_hw_addr *ha;
+                             union ib_gid gid;
+                             u32 port_num;
+                             int ret;
+
+                             netdev_hw_addr_list_for_each(
+                                     ha, &netdev->dev_addrs) {
+                                     memcpy(&gid, ha->addr + 4,
sizeof(gid));
+                                     ret = ib_find_gid(smb_dev->ib_dev,
&gid,
+                                                       &port_num,
NULL);
+                                     if (!ret) {
+                                             rdma_capable = true;
+                                             goto out;

Won't this leak the ndev? It needs a dev_put(ndev) before breaking
the loop, too, right?


This will not leak the ndev. Please look closer, this else branch matches
with
the if test on the existence of ops.get_netdev of the current ib_dev. And
ndev
is acquired only through that op. In the else branch, ndev is just not
acquired.
The original code was indented one more level here so the patch might look
a bit confusing, but one of the if before this block is a deleted(-) line.
You're right.

Ok, so I repeat my comment above!

Tom.


Please send v2 patch after adding comments that Tom pointed out.

Thanks!

+                                     }
+                             }
                       }
-                     dev_put(ndev);
               }
       }
   out:



--
Kangjing "Chaser" Huang





[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux