On Wed, Feb 15, 2017 at 06:18:02PM +0200, Max Gurtovoy wrote:
>
>
> On 2/15/2017 5:38 PM, Sagi Grimberg wrote:
> >
> > > Tests have shown that the following error message is reported when
> > > using SG-GAPS registration with an mlx5 adapter:
> > >
> > > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > > ffff880bd4270eb0
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 2500002a ad9fafd1
> > > scsi host1: ib_srp: reconnect succeeded
> > > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 25000032 00105dd0
> > > scsi host1: ib_srp: failed FAST REG status memory management operation
> > > error (6) for CQE ffff880b92860138
> > >
> > > Hence avoid using SG-GAPS memory registrations. Additionally,
> > > always configure the blk_queue_virt_boundary() to avoid to trigger
> > > a mapping failure when using adapters that support SG-GAPS (e.g.
> > > mlx5).
> >
> > Hi Guys,
> >
> > Sorry for addressing this late, but has this failure been investigated?
> >
> > Max, Israel, what does this error syndrome map to?
>
> Sagi,
> this syndrome says that number of klms to write is bigger than number of
> mtts.
>
> Artemy started investigating it and proposed solution that were tested by
> Laurence.
> Let's see if your fix will help.

No, Artemy's change doesn't fix it.

> >
> > Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> > incremented. Does the following change fix the problem?
> > --
> > diff --git a/drivers/infiniband/hw/mlx5/mr.c
> > b/drivers/infiniband/hw/mlx5/mr.c
> > index 8f608debe141..c21c9eee37f6 100644
> > --- a/drivers/infiniband/hw/mlx5/mr.c
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
> >                 klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
> >                 klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
> >                 klms[i].key = cpu_to_be32(lkey);
> > -               mr->ibmr.length += sg_dma_len(sg);
> > +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
> >
> >                 sg_offset = 0;
> >         }
> > --
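
For reference, here is a minimal user-space sketch of the accounting that
the quoted diff changes. It is not driver code: struct seg, struct klm,
struct totals and fill_klms() are hypothetical stand-ins for the DMA-mapped
scatterlist, the KLM array and the loop in mlx5_ib_sg_to_klms(), and byte
swapping and the lkey are left out. It only shows that, with a non-zero
sg_offset, the old accumulation counts more bytes than the KLM bcount
fields describe. Whether that mismatch is related to the syndrome above is
not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct seg {
	uint64_t dma_addr;
	uint32_t dma_len;
};

/* Hypothetical stand-in for one KLM entry (host byte order, no key). */
struct klm {
	uint64_t va;
	uint32_t bcount;
};

/* Running totals for both ways of accumulating the MR length. */
struct totals {
	uint64_t length_old;	/* "+= sg_dma_len(sg)" */
	uint64_t length_new;	/* "+= sg_dma_len(sg) - sg_offset" */
	uint64_t bcount_sum;	/* bytes actually described by the KLMs */
};

/*
 * Model of the loop body touched by the diff: the offset applies to the
 * first entry only and is reset to zero afterwards.
 */
static struct totals fill_klms(const struct seg *sg, size_t n,
			       uint32_t sg_offset, struct klm *klms)
{
	struct totals t = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		/* The offset must fall inside the current entry. */
		assert(sg_offset < sg[i].dma_len);

		klms[i].va = sg[i].dma_addr + sg_offset;
		klms[i].bcount = sg[i].dma_len - sg_offset;

		t.length_old += sg[i].dma_len;
		t.length_new += sg[i].dma_len - sg_offset;
		t.bcount_sum += klms[i].bcount;

		sg_offset = 0;
	}
	return t;
}
```

With two entries of 4096 and 512 bytes and sg_offset = 256, the old
accumulation yields 4608 while the KLMs describe 4352 bytes, which is what
the "- sg_offset" variant yields as well.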