> On Aug 29, 2018, at 2:17 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
>
>
>> -----Original Message-----
>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>> Sent: Wednesday, August 29, 2018 1:09 PM
>> To: Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx>
>> Cc: linux-rdma <linux-rdma@xxxxxxxxxxxxxxx>; Leon Romanovsky
>> <leon@xxxxxxxxxx>
>> Subject: Re: Are device drivers ready for max_send_sge ?
>>
>>
>>
>>> On Aug 29, 2018, at 1:50 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Hey Chuck,
>>>
>>>> -----Original Message-----
>>>> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
>>>> owner@xxxxxxxxxxxxxxx> On Behalf Of Chuck Lever
>>>> Sent: Wednesday, August 29, 2018 11:49 AM
>>>> To: linux-rdma <linux-rdma@xxxxxxxxxxxxxxx>
>>>> Cc: Leon Romanovsky <leon@xxxxxxxxxx>; Steve Wise
>>>> <swise@xxxxxxxxxxxxxxxxxxxxx>
>>>> Subject: Are device drivers ready for max_send_sge ?
>>>>
>>>> In v4.19-rc1, the NFS/RDMA server has stopped working
>>>> for me.
>>>>
>>>> The reason for this is that the mlx4 driver (with CX-3)
>>>> reports 62 in
>>>>
>>>>    ib_device_attr::max_send_sge
>>>>
>>>> But when the NFS server tries to create a QP with
>>>> qp_attr.cap.max_send_sge set to 62, rdma_create_qp
>>>> fails with -EINVAL. The check that fails is in
>>>>
>>>>    drivers/infiniband/hw/mlx4/qp.c :: set_kernel_sq_size
>>>>
>>>> It's comparing the passed-in max_send_sge against the min()
>>>> of dev->dev->caps.max_sq_sg and dev->dev->caps.max_rq_sg,
>>>> and obviously that fails because max_rq_sg is smaller than
>>>> max_sq_sg.
>>>>
>>>> set_rq_size() also has similar dependencies on max_sq_sg
>>>> that may no longer be appropriate.
>>>>
>>>
>>> These need fixing for sure.
>>>
>>>> When I fix the first sanity check in set_kernel_sq_size
>>>> to ignore max_rq_sg, the third check in set_kernel_sq_size
>>>> fails.
>>>> This is:
>>>>
>>>>    s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg),
>>>>            cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) +
>>>>        send_wqe_overhead(type, qp->flags);
>>>>
>>>>    if (s > dev->dev->caps.max_sq_desc_sz)
>>>>
>>>> I don't know enough about this logic to suggest a fix.
>>>>
>>>
>>> If you set ib_device_attr::max_send_sge to 61, does it then work? Just
>>> wonder if this is a 1-off error.
>>
>> I made this change to reduce ib_device_attr::max_send_sge
>> to 61:
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> index babcfd9..81874f6 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
>> @@ -950,7 +950,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev,
>> struct mlx4_dev_cap *dev_cap)
>> 	}
>>
>> 	MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SG_SQ_OFFSET);
>> -	dev_cap->max_sq_sg = field;
>> +	dev_cap->max_sq_sg = field - 1;
>> 	MLX4_GET(size, outbox, QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET);
>> 	dev_cap->max_sq_desc_sz = size;
>>
>>
>> I added printk's to set_kernel_sq_size to capture the parameters
>> of the failing sanity check.
>>
>> Here's with max_send_sge == 62:
>>
>> Aug 29 12:26:01 klimt kernel: <mlx4_ib> set_kernel_sq_size: s = 1056
>> Aug 29 12:26:01 klimt kernel: <mlx4_ib> set_kernel_sq_size: dev->dev-
>>> caps.max_sq_desc_sz = 1008
>>
>> and with max_send_sge == 61:
>>
>> Aug 29 13:59:40 klimt kernel: <mlx4_ib> set_kernel_sq_size: s = 1040
>> Aug 29 13:59:40 klimt kernel: <mlx4_ib> set_kernel_sq_size: dev->dev-
>>> caps.max_sq_desc_sz = 1008
>>
>
> I guess mlx4_ib_query_device() could be enhanced to limit the max:
>
>   ib_device_attr::max_send_sge = min(fw-advertised-max, (max_sq_desc_space -
>       max_wqe_overhead) / sge_descriptor_size);

Probably. I also see that the NFS server has no reason to request
so many send SGEs.

--
Chuck Lever