On Thu, Oct 19, 2023 at 9:14 PM Tom Talpey <tom@xxxxxxxxxx> wrote: > > On 10/19/2023 6:37 PM, Namjae Jeon wrote: > > 2023-10-19 15:01 GMT+09:00, Kangjing (Chaser) Huang <huangkangjing@xxxxxxxxx>: > >> Thank you for the review. Also thanks a lot to Namjae for submitting > >> this patch on my behalf. > >> > >> > >> On Tue, Oct 17, 2023 at 10:07 AM Tom Talpey <tom@xxxxxxxxxx> wrote: > >>> > >>> On 10/15/2023 10:45 AM, Namjae Jeon wrote: > >>>> From: Kangjing Huang <huangkangjing@xxxxxxxxx> > >>>> > >>>> Physical ib_device does not have an underlying net_device, thus its > >>>> association with IPoIB net_device cannot be retrieved via > >>>> ops.get_netdev() or ib_device_get_by_netdev(). ksmbd reads physical > >>>> ib_device port GUID from the lower 16 bytes of the hardware addresses > >>>> on > >>>> IPoIB net_device and match its underlying ib_device using ib_find_gid() > >>>> > >>>> Signed-off-by: Kangjing Huang <huangkangjing@xxxxxxxxx> > >>>> Acked-by: Namjae Jeon <linkinjeon@xxxxxxxxxx> > >>>> --- > >>>> fs/smb/server/transport_rdma.c | 39 > >>>> +++++++++++++++++++++++++--------- > >>>> 1 file changed, 29 insertions(+), 10 deletions(-) > >>>> > >>>> diff --git a/fs/smb/server/transport_rdma.c > >>>> b/fs/smb/server/transport_rdma.c > >>>> index 3b269e1f523a..a82131f7dd83 100644 > >>>> --- a/fs/smb/server/transport_rdma.c > >>>> +++ b/fs/smb/server/transport_rdma.c > >>>> @@ -2140,8 +2140,7 @@ static int smb_direct_ib_client_add(struct > >>>> ib_device *ib_dev) > >>>> if (ib_dev->node_type != RDMA_NODE_IB_CA) > >>>> smb_direct_port = SMB_DIRECT_PORT_IWARP; > >>>> > >>>> - if (!ib_dev->ops.get_netdev || > >>>> - !rdma_frwr_is_supported(&ib_dev->attrs)) > >>>> + if (!rdma_frwr_is_supported(&ib_dev->attrs)) > >>>> return 0; > >>>> > >>>> smb_dev = kzalloc(sizeof(*smb_dev), GFP_KERNEL); > >>>> @@ -2241,17 +2240,37 @@ bool ksmbd_rdma_capable_netdev(struct net_device > >>>> *netdev) > >>>> for (i = 0; i < smb_dev->ib_dev->phys_port_cnt; i++) { > >>>> struct net_device *ndev; > >>>> > >>>> - ndev = > >>>> smb_dev->ib_dev->ops.get_netdev(smb_dev->ib_dev, > >>>> - i + 1); > >>>> - if (!ndev) > >>>> - continue; > >>>> + /* RoCE and iWRAP ib_dev is backed by a netdev */ > >>>> + if (smb_dev->ib_dev->ops.get_netdev) { > >>> > >>> The "IWRAP" is a typo, but IMO the comment is misleading. This is simply > >>> looking up the target netdev, it's not limited to these two rdma types. > >>> I suggest deleting the comment. > >>> > >> > >> I agree with this point and will update this comment in version 2 of the > >> patch. > >> > >>>> + ndev = smb_dev->ib_dev->ops.get_netdev( > >>>> + smb_dev->ib_dev, i + 1); > >>>> + if (!ndev) > >>>> + continue; > >>>> > >>>> - if (ndev == netdev) { > >>>> + if (ndev == netdev) { > >>>> + dev_put(ndev); > >>>> + rdma_capable = true; > >>>> + goto out; > >>>> + } > >>>> dev_put(ndev); > >>> > >>> Why not move this dev_put up above the if (ndev == netdev) test? It's > >>> needed in both cases, so it's confusing to have two copies. > >> > >> I do agree that these two puts are very confusing. However, this code > >> structure comes from the original ksmbd_rdma_capable_netdev() and > >> this patch only indents it one more level and puts it in the if test for > >> get_netdev. > >> > >> Also, while two puts look very weird, IMO putting it before the if > >> statement > >> is equally unintuitive as well since that would be testing the pointer after > >> its > >> reference is released. Although since no de-reference is happening here, it > >> might be fine. Please confirm this is good coding-style-wise and I will > >> include > >> this change in v2 of the patch. > > There is no need to update code that does not have problems. > > Well, there are now at least half a dozen stanzas of > dev_put > rdma_capable = <T|f> > goto out > > and I don't see why these can't be simplified to > > goto out_capable|out_notcapable > > It woud be a lot clearer, at least to this reviewer. And more reliable > to maintain. > I agree, but this consolidation would be impossible without a major overhaul of the code structure in ksmbd_rdma_capable_netdev(). There are several clean-up calls that need to be made (read_unlock(&smb_direct_device_lock), dev_put, ib_device_put) and the biggest problem is brought in by the block of code that is doing a reverse lookup on ib_dev structs right after the patched loop. I don't know why it is necessary for such a reverse lookup, but given that these code depends on the behaviors of ib device drivers, which can be very erratic. I feel that removing that block could break something with some other devices. In conclusion, I feel addressing these issues in this function is beyond the scope of this patch and they should be addressed separately. I will move forward with the comment change in v2. > >> > >>> > >>>> - rdma_capable = true; > >>>> - goto out; > >>>> + /* match physical ib_dev with IPoIB netdev by GUID > >>>> */ > >>> > >>> Add more information to this comment, perhaps: > >>> > >>> /* if no exact netdev match, check for matching Infiniband GUID */ > >>> > >> > >> Fair point, will update this comment in v2. > >> > >>>> + } else if (netdev->type == ARPHRD_INFINIBAND) { > >>>> + struct netdev_hw_addr *ha; > >>>> + union ib_gid gid; > >>>> + u32 port_num; > >>>> + int ret; > >>>> + > >>>> + netdev_hw_addr_list_for_each( > >>>> + ha, &netdev->dev_addrs) { > >>>> + memcpy(&gid, ha->addr + 4, > >>>> sizeof(gid)); > >>>> + ret = ib_find_gid(smb_dev->ib_dev, > >>>> &gid, > >>>> + &port_num, > >>>> NULL); > >>>> + if (!ret) { > >>>> + rdma_capable = true; > >>>> + goto out; > >>> > >>> Won't this leak the ndev? It needs a dev_put(ndev) before breaking > >>> the loop, too, right? > >>> > >> > >> This will not leak the ndev. Please look closer, this else branch matches > >> with > >> the if test on the existence of ops.get_netdev of the current ib_dev. And > >> ndev > >> is acquired only through that op. In the else branch, ndev is just not > >> acquired. > >> The original code was indented one more level here so the patch might look > >> a bit confusing, but one of the if before this block is a deleted(-) line. > > You're right. > > Ok, so I repeat my comment above! > > Tom. > > > > > Please send v2 patch after adding comments that Tom pointed out. > > > > Thanks! > >> > >>>> + } > >>>> + } > >>>> } > >>>> - dev_put(ndev); > >>>> } > >>>> } > >>>> out: > >> > >> > >> > >> -- > >> Kangjing "Chaser" Huang > >> > > -- Kangjing "Chaser" Huang