Re: Are device drivers ready for max_send_sge ?

On Wed, Aug 29, 2018 at 04:41:42PM -0400, Chuck Lever wrote:
>
>
> > On Aug 29, 2018, at 2:52 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >
> >
> >
> >> On Aug 29, 2018, at 2:17 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >>> Sent: Wednesday, August 29, 2018 1:09 PM
> >>> To: Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx>
> >>> Cc: linux-rdma <linux-rdma@xxxxxxxxxxxxxxx>; Leon Romanovsky
> >>> <leon@xxxxxxxxxx>
> >>> Subject: Re: Are device drivers ready for max_send_sge ?
> >>>
> >>>
> >>>
> >>>> On Aug 29, 2018, at 1:50 PM, Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> Hey Chuck,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-owner@xxxxxxxxxxxxxxx> On Behalf Of Chuck Lever
> >>>>> Sent: Wednesday, August 29, 2018 11:49 AM
> >>>>> To: linux-rdma <linux-rdma@xxxxxxxxxxxxxxx>
> >>>>> Cc: Leon Romanovsky <leon@xxxxxxxxxx>; Steve Wise
> >>>>> <swise@xxxxxxxxxxxxxxxxxxxxx>
> >>>>> Subject: Are device drivers ready for max_send_sge ?
> >>>>>
> >>>>> In v4.19-rc1, the NFS/RDMA server has stopped working
> >>>>> for me.
> >>>>>
> >>>>> The reason for this is that the mlx4 driver (with CX-3)
> >>>>> reports 62 in
> >>>>>
> >>>>> ib_device_attr::max_send_sge
> >>>>>
> >>>>> But when the NFS server tries to create a QP with a
> >>>>> qp_attr.cap.max_send_sge set to 62, rdma_create_qp
> >>>>> fails with -EINVAL. The check that fails is in
> >>>>>
> >>>>> drivers/infiniband/hw/mlx4/qp.c :: set_kernel_sq_size
> >>>>>
> >>>>> It's comparing the passed-in max_send_sge against the min()
> >>>>> of dev->dev->caps.max_sq_sg and dev->dev->caps.max_rq_sg,
> >>>>> and obviously that fails because max_rq_sg is smaller than
> >>>>> max_sq_sg.
> >>>>>
> >>>>> set_rq_size() also has similar dependencies on max_sq_sg
> >>>>> that may no longer be appropriate.
> >>>>>
> >>>>
> >>>> These need fixing for sure.
> >>>>
> >>>>> When I fix the first sanity check in set_kernel_sq_size
> >>>>> to ignore max_rq_sg, the third check in set_kernel_sq_size
> >>>>> fails. This is:
> >>>>>
> >>>>> s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg),
> >>>>>        cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) +
> >>>>>        send_wqe_overhead(type, qp->flags);
> >>>>>
> >>>>> if (s > dev->dev->caps.max_sq_desc_sz)
> >>>>>
> >>>>> I don't know enough about this logic to suggest a fix.
> >>>>>
> >>>>
> >>>> If you set ib_device_attr::max_send_sge to 61, does it then work? Just
> >>>> wondering if this is an off-by-one error.
> >>>
> >>> I made this change to reduce ib_device_attr::max_send_sge
> >>> to 61:
> >>>
> >>> diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
> >>> index babcfd9..81874f6 100644
> >>> --- a/drivers/net/ethernet/mellanox/mlx4/fw.c
> >>> +++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
> >>> @@ -950,7 +950,7 @@ int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
> >>>       }
> >>>
> >>>       MLX4_GET(field, outbox, QUERY_DEV_CAP_MAX_SG_SQ_OFFSET);
> >>> -       dev_cap->max_sq_sg = field;
> >>> +       dev_cap->max_sq_sg = field - 1;
> >>>       MLX4_GET(size, outbox, QUERY_DEV_CAP_MAX_DESC_SZ_SQ_OFFSET);
> >>>       dev_cap->max_sq_desc_sz = size;
> >>>
> >>>
> >>> I added printk's to set_kernel_sq_size to capture the parameters
> >>> of the failing sanity check.
> >>>
> >>> Here's with max_send_sge == 62:
> >>>
> >>> Aug 29 12:26:01 klimt kernel: <mlx4_ib> set_kernel_sq_size: s = 1056
> >>> Aug 29 12:26:01 klimt kernel: <mlx4_ib> set_kernel_sq_size: dev->dev->caps.max_sq_desc_sz = 1008
> >>>
> >>> and with max_send_sge == 61:
> >>>
> >>> Aug 29 13:59:40 klimt kernel: <mlx4_ib> set_kernel_sq_size: s = 1040
> >>> Aug 29 13:59:40 klimt kernel: <mlx4_ib> set_kernel_sq_size: dev->dev->caps.max_sq_desc_sz = 1008
> >>>
> >>
> >> I guess mlx4_ib_query_device() could be enhanced to limit the max:
> >>
> >> ib_device_attr::max_send_sge = min(fw-advertised-max, (max_sq_desc_space -
> >> max_wqe_overhead) / sge_descriptor_size);
> >
> > Probably. I also see that the NFS server has no reason to
> > request so many send SGEs.
>
> I've now tested a couple of other cards/drivers. mlx4/CX-3 is the
> only one I've found with this problem so far.

Thanks Chuck and Steve,

I checked our mlx4 HW spec, and it seems that the check in
set_kernel_sq_size() is the right one. We have to take into account
four maximum parameters: max_sg_rq, max_sg_sq, max_desc_sz_rq,
and max_desc_sz_sq.
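
For the record, the numbers in the logs above are consistent with a
16-byte struct mlx4_wqe_data_seg and 64 bytes of send_wqe_overhead()
for this QP type (the 64 is inferred from the logged values, not taken
from the spec):

    62 * 16 + 64 = 1056 > 1008   (max_send_sge = 62 is rejected)
    61 * 16 + 64 = 1040 > 1008   (max_send_sge = 61 is rejected)
    (1008 - 64) / 16 = 59        (largest SGE count that fits this QP type)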

However, instead of bringing this complexity into query_device, which
still wouldn't be sufficient anyway (the calculations depend on the QP
type), the safer approach is to restore the old code, which will give
you 32 SGEs.
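
Just to illustrate that complexity: the send-side bound query_device
would have to compute per QP type looks roughly like the sketch below
(derived from the check quoted above and Steve's suggestion, not a
proposed patch; send_wqe_overhead() depends on the QP type and flags,
which query_device does not know):

	max_usable_send_sge =
		min(dev->dev->caps.max_sq_sg,
		    (dev->dev->caps.max_sq_desc_sz -
		     send_wqe_overhead(type, qp->flags)) /
			sizeof(struct mlx4_wqe_data_seg));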

Thanks

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index ca0f1ee26091..21e268247f4d 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -517,8 +517,8 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
 	props->page_size_cap	   = dev->dev->caps.page_size_cap;
 	props->max_qp		   = dev->dev->quotas.qp;
 	props->max_qp_wr	   = dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE;
-	props->max_send_sge	   = dev->dev->caps.max_sq_sg;
-	props->max_recv_sge	   = dev->dev->caps.max_rq_sg;
+	props->max_send_sge	   = min(dev->dev->caps.max_sq_sg, dev->dev->caps.max_rq_sg);
+	props->max_recv_sge	   = min(dev->dev->caps.max_sq_sg, dev->dev->caps.max_rq_sg);
 	props->max_sge_rd	   = MLX4_MAX_SGE_RD;
 	props->max_cq		   = dev->dev->quotas.cq;
 	props->max_cqe		   = dev->dev->caps.max_cqes;
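
As Chuck notes above, the NFS server also has no reason to ask for that
many send SGEs. Independent of the driver fix, a defensive pattern on
the ULP side is to clamp the requested counts to the advertised
attributes before calling rdma_create_qp(). A minimal sketch
(hypothetical helper, not the actual svcrdma code; it assumes the new
ib_device_attr::max_send_sge/max_recv_sge fields):

#include <linux/kernel.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>

/* Hypothetical helper: never ask for more SGEs than the device advertises. */
static int ulp_create_qp_clamped(struct rdma_cm_id *id, struct ib_pd *pd,
				 struct ib_qp_init_attr *attr)
{
	const struct ib_device_attr *dattr = &id->device->attrs;

	attr->cap.max_send_sge = min_t(u32, attr->cap.max_send_sge,
				       dattr->max_send_sge);
	attr->cap.max_recv_sge = min_t(u32, attr->cap.max_recv_sge,
				       dattr->max_recv_sge);

	return rdma_create_qp(id, pd, attr);
}

Of course this only helps when the device advertises an honest limit,
which is what the diff above restores for mlx4.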

>
>
> --
> Chuck Lever
>
>
>


