On Thu, Jul 26, 2018 at 10:08:37AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
>
> In the current implementation, the driver tries to allocate contiguous
> memory, and if it fails, it falls back to 4K fragmented allocation.
>
> Once the memory is fragmented, the first allocation might take a lot
> of time, and even fail, which can cause connection failures.
>
> This patch changes the logic to always allocate with 4K granularity,
> since it's more robust and more likely to succeed.
>
> This patch was tested with Lustre and no performance degradation
> was observed.
>
> Note: This commit eliminates the "shrinking WQE" feature. This feature
> depended on using vmap to create a virtually contiguous send WQ.
> vmap use was abandoned due to problems with several processors (see the
> commit cited below). As a result, shrinking WQE was available only with
> physically contiguous send WQs. Allocating such send WQs caused the
> problems described above.
> Therefore, as a side effect of eliminating the use of large physically
> contiguous send WQs, the shrinking WQE feature became unavailable.
>
> Warning example:
> worker/20:1: page allocation failure: order:8, mode:0x80d0
> CPU: 20 PID: 513 Comm: kworker/20:1 Tainted: G OE ------------
> Workqueue: ib_cm cm_work_handler [ib_cm]
> Call Trace:
> [<ffffffff81686d81>] dump_stack+0x19/0x1b
> [<ffffffff81186160>] warn_alloc_failed+0x110/0x180
> [<ffffffff8118a954>] __alloc_pages_nodemask+0x9b4/0xba0
> [<ffffffff811ce868>] alloc_pages_current+0x98/0x110
> [<ffffffff81184fae>] __get_free_pages+0xe/0x50
> [<ffffffff8133f6fe>] swiotlb_alloc_coherent+0x5e/0x150
> [<ffffffff81062551>] x86_swiotlb_alloc_coherent+0x41/0x50
> [<ffffffffa056b4c4>] mlx4_buf_direct_alloc.isra.7+0xc4/0x180 [mlx4_core]
> [<ffffffffa056b73b>] mlx4_buf_alloc+0x1bb/0x260 [mlx4_core]
> [<ffffffffa0b15496>] create_qp_common+0x536/0x1000 [mlx4_ib]
> [<ffffffff811c6ef7>] ? dma_pool_free+0xa7/0xd0
> [<ffffffffa0b163c1>] mlx4_ib_create_qp+0x3b1/0xdc0 [mlx4_ib]
> [<ffffffffa0b01bc2>] ? mlx4_ib_create_cq+0x2d2/0x430 [mlx4_ib]
> [<ffffffffa0b21f20>] mlx4_ib_create_qp_wrp+0x10/0x20 [mlx4_ib]
> [<ffffffffa08f152a>] ib_create_qp+0x7a/0x2f0 [ib_core]
> [<ffffffffa06205d4>] rdma_create_qp+0x34/0xb0 [rdma_cm]
> [<ffffffffa08275c9>] kiblnd_create_conn+0xbf9/0x1950 [ko2iblnd]
> [<ffffffffa074077a>] ? cfs_percpt_unlock+0x1a/0xb0 [libcfs]
> [<ffffffffa0835519>] kiblnd_passive_connect+0xa99/0x18c0 [ko2iblnd]
>
> Fixes: 73898db04301 ("net/mlx4: Avoid wrong virtual mappings")
> Signed-off-by: Jack Morgenstein <jackm@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx4/mlx4_ib.h |   1 -
>  drivers/infiniband/hw/mlx4/qp.c      | 209 ++++++-----------------------
>  2 files changed, 34 insertions(+), 176 deletions(-)

Applied to for-next

Thanks,
Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
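
[Editorial note: for readers unfamiliar with the allocation pattern being removed,
below is a minimal userspace C sketch of the two strategies the commit message
contrasts. This is not the mlx4 driver code; all names here (sketch_buf,
sketch_alloc_try_contig, sketch_alloc_fragmented, FRAG_SIZE) are hypothetical,
and malloc() merely stands in for the kernel's high-order page / DMA-coherent
allocations. It only illustrates why the old "try contiguous, fall back to 4K
fragments" path can hit an order-8 allocation failure, while the new path never
asks for more than one page at a time.]

	/*
	 * Hypothetical sketch, not the actual mlx4 implementation.
	 */
	#include <stdlib.h>

	#define FRAG_SIZE 4096

	struct sketch_buf {
		size_t  nfrags;   /* number of fragments (1 if contiguous) */
		void  **frags;    /* one pointer per fragment */
	};

	/* New scheme: always allocate FRAG_SIZE chunks, never a large block. */
	static int sketch_alloc_fragmented(struct sketch_buf *b, size_t size)
	{
		size_t i, n = (size + FRAG_SIZE - 1) / FRAG_SIZE;

		b->frags = calloc(n, sizeof(*b->frags));
		if (!b->frags)
			return -1;
		for (i = 0; i < n; i++) {
			b->frags[i] = malloc(FRAG_SIZE);
			if (!b->frags[i])
				return -1;   /* real code would unwind here */
		}
		b->nfrags = n;
		return 0;
	}

	/* Old scheme: try one large contiguous buffer, fall back to fragments. */
	static int sketch_alloc_try_contig(struct sketch_buf *b, size_t size)
	{
		void *p = malloc(size);   /* stands in for the order-N allocation */

		if (p) {
			b->frags = malloc(sizeof(*b->frags));
			if (!b->frags) {
				free(p);
				return -1;
			}
			b->frags[0] = p;
			b->nfrags = 1;
			return 0;
		}
		/* Contiguous attempt failed (or was slow); fall back to 4K fragments. */
		return sketch_alloc_fragmented(b, size);
	}

	int main(void)
	{
		struct sketch_buf a, b;

		/* Old behaviour vs. new behaviour for a 1 MB send WQ, for example. */
		if (sketch_alloc_try_contig(&a, 1 << 20))
			return 1;
		if (sketch_alloc_fragmented(&b, 1 << 20))
			return 1;
		return 0;
	}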