Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS

----- Original Message -----
> From: "Sagi Grimberg" <sagi@xxxxxxxxxxx>
> To: "Bart Van Assche" <bart.vanassche@xxxxxxxxxxx>, "Doug Ledford" <dledford@xxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx, "Israel Rukshin" <israelr@xxxxxxxxxxxx>, "Max Gurtovoy" <maxg@xxxxxxxxxxxx>,
> "Leon Romanovsky" <leonro@xxxxxxxxxxxx>, "Mark Bloch" <markb@xxxxxxxxxxxx>, "Yuval Shaia" <yuval.shaia@xxxxxxxxxx>,
> "# 4.7+" <stable@xxxxxxxxxxxxxxx>
> Sent: Wednesday, February 15, 2017 10:38:06 AM
> Subject: Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
> 
> 
> > Tests have shown that the following error message is reported when
> > using SG-GAPS registration with an mlx5 adapter:
> >
> > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > ffff880bd4270eb0
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 2500002a ad9fafd1
> > scsi host1: ib_srp: reconnect succeeded
> > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 25000032 00105dd0
> > scsi host1: ib_srp: failed FAST REG status memory management operation
> > error (6) for CQE ffff880b92860138
> >
> > Hence avoid using SG-GAPS memory registrations. Additionally,
> > always configure the blk_queue_virt_boundary() to avoid to trigger
> > a mapping failure when using adapters that support SG-GAPS (e.g.
> > mlx5).
> 
> Hi Guys,
> 
> Sorry for addressing this late, but has this failure been investigated?
> 
> Max, Israel, what does this error syndrome map to?
> 
> Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> incremented. Does the following change fix the problem?
> --
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 8f608debe141..c21c9eee37f6 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
>                  klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
>                  klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
>                  klms[i].key = cpu_to_be32(lkey);
> -               mr->ibmr.length += sg_dma_len(sg);
> +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
> 
>                  sg_offset = 0;
>          }
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
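
For anyone following along, here is a small userspace model of the length accounting that the quoted diff touches. It is only a sketch and not the kernel code: struct sg_model, struct klm_model, model_sg_to_klms() and sum_bcount() are made-up names for illustration, byte-order conversion and the lkey are left out, and the only thing it shows is how the accumulated length compares with the sum of the per-entry byte counts when the first entry has a nonzero offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sg_model {
	uint64_t dma_addr;
	uint32_t dma_len;
};

/* Hypothetical stand-in for one KLM descriptor, kept in host byte order. */
struct klm_model {
	uint64_t va;
	uint32_t bcount;
};

/*
 * Fill klms[] from sg[] and return the accumulated region length.
 * sg_offset applies to the first entry only and is reset afterwards,
 * as in the quoted loop. With subtract_offset == 0 the full entry
 * length is added (the "-" line of the diff); with subtract_offset != 0
 * the offset is subtracted first (the "+" line of the diff).
 */
static uint64_t model_sg_to_klms(const struct sg_model *sg, size_t n,
				 uint32_t sg_offset, struct klm_model *klms,
				 int subtract_offset)
{
	uint64_t length = 0;

	for (size_t i = 0; i < n; i++) {
		/* The offset must lie within the entry it applies to. */
		assert(sg_offset <= sg[i].dma_len);

		klms[i].va = sg[i].dma_addr + sg_offset;
		klms[i].bcount = sg[i].dma_len - sg_offset;

		if (subtract_offset)
			length += sg[i].dma_len - sg_offset;
		else
			length += sg[i].dma_len;

		sg_offset = 0;
	}
	return length;
}

/* Sum of the byte counts actually described by the KLM entries. */
static uint64_t sum_bcount(const struct klm_model *klms, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += klms[i].bcount;
	return total;
}
```

With two entries of 4096 and 512 bytes and an offset of 256, the first variant returns a length that is 256 bytes larger than what sum_bcount() reports, while the second variant matches it. That only illustrates the bookkeeping difference the diff is about; it says nothing about whether that difference is what the hardware is objecting to.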

I started with Linus's tree, applied the change Sagi suggested, built the kernel, rebooted, and started the tests.

Linux ibclient 4.10.0-rc8.sagi+ #1 SMP Wed Feb 15 11:09:44 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

Very quickly I get to this:

[  180.990285] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
[  181.016899] 00000000 00000000 00000000 00000000
[  181.040949] 00000000 00000000 00000000 00000000
[  181.066960] 00000000 00000000 00000000 00000000
[  181.092030] 00000000 0f007806 2500002a bf1913d0
[  181.117254] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880bdbe88778
[  196.288933] fast_io_fail_tmo expired for SRP port-2:1 / host2.
[  197.090886] scsi host2: ib_srp: reconnect succeeded
[  197.127628] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817f09b6f30

So the change does not help.
I think Bart's and my suggestion to revert for now is the best way forward.
I have already tested this in depth from Bart's tree, and it has been sent to Doug as v2 of Bart's recent 8-patch series.

Thanks
Laurence