Are device drivers ready for max_send_sge?

In v4.19-rc1, the NFS/RDMA server has stopped working
for me.

The reason for this is that the mlx4 driver (with CX-3)
reports 62 in

  ib_device_attr::max_send_sge

But when the NFS server tries to create a QP with
qp_attr.cap.max_send_sge set to 62, rdma_create_qp
fails with -EINVAL. The check that fails is in

  drivers/infiniband/hw/mlx4/qp.c :: set_kernel_sq_size

It compares the passed-in max_send_sge against the min()
of dev->dev->caps.max_sq_sg and dev->dev->caps.max_rq_sg,
and that comparison naturally fails because max_rq_sg is
smaller than max_sq_sg.

set_rq_size() also has similar dependencies on max_sq_sg
that may no longer be appropriate.

When I relax the first sanity check in set_kernel_sq_size
to ignore max_rq_sg, the third check in set_kernel_sq_size
fails instead. That check is:

  s = max(cap->max_send_sge * sizeof (struct mlx4_wqe_data_seg),
          cap->max_inline_data + sizeof (struct mlx4_wqe_inline_seg)) +
          send_wqe_overhead(type, qp->flags);

  if (s > dev->dev->caps.max_sq_desc_sz)

I don't know enough about this logic to suggest a fix.

Is there a driver-level fix in the works, or should I
consider changing the NFS server to compute a smaller
qp_attr.cap.max_send_sge?

--
Chuck Lever