Re: [PATCH rdma-next] IB/mlx4: Use 4K pages for kernel QP's WQE buffer

On Thu, Jul 26, 2018 at 10:08:37AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
> 
> In the current implementation, the driver tries to allocate contiguous
> memory, and if it fails, it falls back to 4K fragmented allocation.
> 
> Once memory becomes fragmented, the contiguous allocation may take a
> long time or even fail, which can cause connection failures.
> 
> This patch changes the logic to always allocate with 4K granularity,
> which is more robust and far more likely to succeed.
> 
> This patch was tested with Lustre and no performance degradation
> was observed.
> 
> Note: This commit eliminates the "shrinking WQE" feature. That feature
> depended on using vmap to create a virtually contiguous send WQ, but
> vmap use was abandoned due to problems on several processor types (see
> the commit cited in the Fixes tag below). As a result, shrinking WQEs
> were available only with physically contiguous send WQs, and allocating
> such send WQs caused the problems described above.
> Therefore, as a side effect of eliminating large physically contiguous
> send WQs, the shrinking WQE feature is removed as well.
> 
> Warning example:
> kworker/20:1: page allocation failure: order:8, mode:0x80d0
> CPU: 20 PID: 513 Comm: kworker/20:1 Tainted: G OE ------------
> Workqueue: ib_cm cm_work_handler [ib_cm]
> Call Trace:
> [<ffffffff81686d81>] dump_stack+0x19/0x1b
> [<ffffffff81186160>] warn_alloc_failed+0x110/0x180
> [<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
> [<ffffffff811ce868>] alloc_pages_current+0x98/0x110
> [<ffffffff81184fae>] __get_free_pages+0xe/0x50
> [<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
> [<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
> [<ffffffffa056b4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
> [<ffffffffa056b73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
> [<ffffffffa0b15496>] create_qp_common+0x536/0x1000 [mlx4_ib]
> [<ffffffff811c6ef7>] ? dma_pool_free+0xa7/0xd0
> [<ffffffffa0b163c1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
> [<ffffffffa0b01bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
> [<ffffffffa0b21f20>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
> [<ffffffffa08f152a>] ib_create_qp+0x7a/0x2f0 [ib_core]
> [<ffffffffa06205d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
> [<ffffffffa08275c9>] kiblnd_create_conn+0xbf9/0x1950 [ko2iblnd]
> [<ffffffffa074077a>] ? cfs_percpt_unlock+0x1a/0xb0 [libcfs]
> [<ffffffffa0835519>] kiblnd_passive_connect+0xa99/0x18c0 [ko2iblnd]
> 
> Fixes: 73898db04301 ("net/mlx4: Avoid wrong virtual mappings")
> Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx4/mlx4_ib.h |   1 -
>  drivers/infiniband/hw/mlx4/qp.c      | 209 ++++++-----------------------------
>  2 files changed, 34 insertions(+), 176 deletions(-)

Applied to for-next

Thanks,
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


