Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS

----- Original Message -----
> From: "Leon Romanovsky" <leonro@xxxxxxxxxxxx>
> To: "Bart Van Assche" <bart.vanassche@xxxxxxxxxxx>, "Max Gurtovoy" <maxg@xxxxxxxxxxxx>
> Cc: "Doug Ledford" <dledford@xxxxxxxxxx>, linux-rdma@xxxxxxxxxxxxxxx, "Israel Rukshin" <israelr@xxxxxxxxxxxx>,
> "Mark Bloch" <markb@xxxxxxxxxxxx>, "Yuval Shaia" <yuval.shaia@xxxxxxxxxx>, "Artemy Kovalyov" <artemyko@xxxxxxxxxxxx>,
> "# 4.7+" <stable@xxxxxxxxxxxxxxx>
> Sent: Wednesday, February 15, 2017 3:19:45 AM
> Subject: Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
> 
> On Wed, Feb 15, 2017 at 09:14:49AM +0200, Leon Romanovsky wrote:
> > On Tue, Feb 14, 2017 at 10:56:29AM -0800, Bart Van Assche wrote:
> > > Tests have shown that the following error message is reported when
> > > using SG-GAPS registration with an mlx5 adapter:
> > >
> > > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > > ffff880bd4270eb0
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 2500002a ad9fafd1
> > > scsi host1: ib_srp: reconnect succeeded
> > > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 25000032 00105dd0
> > > scsi host1: ib_srp: failed FAST REG status memory management operation
> > > error (6) for CQE ffff880b92860138
> > >
> > > Hence avoid using SG-GAPS memory registrations. Additionally,
> > > always configure the blk_queue_virt_boundary() to avoid to trigger
> > > a mapping failure when using adapters that support SG-GAPS (e.g.
> > > mlx5).
> >
> > According to the error dump, we have an issue with max_page_list_len
> > supplied and/or
> > internal calculations from that value to the UMR byte count.
> 
> Hi Bart,
> 
> Do you mind to try your test on my branch rdma-next [1] with the following
> fixup?
> 
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 3c1f483d003f..3e59dce10d5e 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1045,8 +1045,9 @@ int mlx5_ib_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages,
>  	for (pages_mapped = 0;
>  	     pages_mapped < pages_to_map && !err;
>  	     pages_mapped += pages_iter, idx += pages_iter) {
> +		npages = min_t(int, pages_iter, pages_to_map - pages_mapped);
>  		dma_sync_single_for_cpu(ddev, dma, size, DMA_TO_DEVICE);
> -		npages = populate_xlt(mr, idx, pages_iter, xlt,
> +		npages = populate_xlt(mr, idx, npages, xlt,
>  				      page_shift, size, flags);
> 
>  		dma_sync_single_for_device(ddev, dma, size, DMA_TO_DEVICE);
> 
> [1]
> https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=rdma-next
> 
> Thanks
> 

Hello Leon,
I replied earlier, but I don't know whether my reply made it through.
I will have to test this.

Is this repo https://git.kernel.org/cgit/linux/kernel/git/leon/linux-rdma.git/log/?h=rdma-next already patched with the change you want?
If not, can I just take the patch and apply it to my earlier tree, which is based only on Linus's tree with the patch reverted?
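
For anyone applying the fixup by hand, the behavioural change in the loop is small and can be illustrated outside the kernel. The following is only a standalone sketch, not the mlx5 code: populate_chunk() is a hypothetical stand-in for populate_xlt(), and the two map_pages_* helpers are made-up names that mimic the loop with and without the min_t() clamp from the quoted diff. Whether the unclamped request is what triggers the reported CQE errors is not established in this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for populate_xlt(): pretends to fill `npages`
 * translation entries starting at `idx` and returns how many it wrote. */
static int populate_chunk(int idx, int npages)
{
	(void)idx;
	return npages;
}

/* Loop shape after the fixup: the chunk size is clamped to the number of
 * pages still left to map, so the last iteration never asks for more. */
static int map_pages_clamped(int pages_to_map, int pages_iter)
{
	int total = 0;
	int pages_mapped;

	assert(pages_iter > 0);
	for (pages_mapped = 0; pages_mapped < pages_to_map;
	     pages_mapped += pages_iter) {
		int remaining = pages_to_map - pages_mapped;
		int npages = pages_iter < remaining ? pages_iter : remaining;

		total += populate_chunk(pages_mapped, npages);
	}
	return total;
}

/* Loop shape before the fixup: every iteration requests a full
 * `pages_iter` chunk, even when fewer pages remain. */
static int map_pages_unclamped(int pages_to_map, int pages_iter)
{
	int total = 0;
	int pages_mapped;

	assert(pages_iter > 0);
	for (pages_mapped = 0; pages_mapped < pages_to_map;
	     pages_mapped += pages_iter)
		total += populate_chunk(pages_mapped, pages_iter);
	return total;
}
```

In this sketch, mapping 10 pages in chunks of 4 requests 4 + 4 + 2 entries with the clamp and 4 + 4 + 4 without it, so the two versions differ only when pages_to_map is not a multiple of pages_iter.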

Thanks
Laurence


