Re: [PATCH] IB/mlx5: Reduce mlx5_ib_wq cacheline bouncing

Yann Droneaud <ydroneaud@xxxxxxxxxx> · Tue, 12 Jan 2016 15:37:11 +0100

Hi,

Le mardi 12 janvier 2016 à 12:32 +0200, Sagi Grimberg a écrit :
> mlx5 keeps a lot of internal accounting for wr processing.
> mlx5_ib_wq consists of multiple arrays:
> struct mlx5_ib_wq {
> 	u64		       *wrid;
> 	u32		       *wr_data;
> 	struct wr_list	       *w_list;
> 	unsigned	       *wqe_head;
> 	...
> }
> 
> Each time we access each of these arrays, even for a single index
> we fetch a cacheline. Reduce cacheline bounces by fitting these
> members
> in a cacheline aligned struct (swr_ctx) and allocate an array.
> Accessing
> this array will fetch all of these members in a single shot.
> 
> Since the receive queue needs only the wrid we use a nameless union
> where in the rwr_ctx we only have wrid member.
> 
> Signed-off-by: Sagi Grimberg <sagig@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx5/cq.c      | 18 +++++++--------
>  drivers/infiniband/hw/mlx5/mlx5_ib.h | 21 +++++++++++++----
>  drivers/infiniband/hw/mlx5/qp.c      | 45 +++++++++++++++-----------
> ----------
>  3 files changed, 44 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index d4b227126265..84cb8fc072a1 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -129,11 +129,24 @@ struct wr_list {
>  	u16	next;
>  };
>  
> +/* Please don't let this exceed a single cacheline */

Don't add a comment, add a compile time assert:

BUILD_BUG_ON(sizeof(struct swr_ctx) <= L1_CACHE_BYTES);

> +struct swr_ctx {
> +	u64		wrid;
> +	u32		wr_data;
> +	struct wr_list	w_list;
> +	u32		wqe_head;
> +	u8		rsvd[12];
> +}__packed;
> +

Packing the structure might make some fields unaligned and, on some
architecture, unaligned access are likely unwelcomed.

What about

struct swr_ctx {
	u64		wrid;
	u32		wr_data;
	struct wr_list	w_list;
	u32		wqe_head;
} ____cacheline_aligned;

> +struct rwr_ctx {
> +	u64		       wrid;
> +}__packed;
> +
>  struct mlx5_ib_wq {
> -	u64		       *wrid;
> -	u32		       *wr_data;
> -	struct wr_list	       *w_list;
> -	unsigned	       *wqe_head;
> +	union {
> +		struct swr_ctx *swr_ctx;
> +		struct rwr_ctx *rwr_ctx;
> +	};
>  	u16		        unsig_count;

Check the structure layout is the one you expect with pahole.

diff --git a/drivers/infiniband/hw/mlx5/qp.c
> b/drivers/infiniband/hw/mlx5/qp.c
> index 1ea049ed87da..a6b88902d7af 100644
> --- a/drivers/infiniband/hw/mlx5/qp.c
> +++ b/drivers/infiniband/hw/mlx5/qp.c
> @@ -794,14 +794,11 @@ static int create_kernel_qp(struct mlx5_ib_dev
> *dev,
>  		goto err_free;
>  	}
>  
> -	qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wrid), GFP_KERNEL);
> -	qp->sq.wr_data = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wr_data), GFP_KERNEL);
> -	qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(*qp->rq.wrid), GFP_KERNEL);
> -	qp->sq.w_list = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.w_list), GFP_KERNEL);
> -	qp->sq.wqe_head = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wqe_head), GFP_KERNEL);
> -
> -	if (!qp->sq.wrid || !qp->sq.wr_data || !qp->rq.wrid ||
> -	    !qp->sq.w_list || !qp->sq.wqe_head) {
> +	qp->sq.swr_ctx = kcalloc(qp->sq.wqe_cnt, sizeof(*qp->sq.swr_ctx),
> +				 GFP_KERNEL);
> +	qp->rq.rwr_ctx = kcalloc(qp->rq.wqe_cnt, sizeof(*qp->sq.rwr_ctx),
> +				 GFP_KERNEL);
> 

Anyway, I'm not sure about the alignment of the memory returned by
kcalloc(), I should have known by the time but don't find time to
figure how it's handled on various SL*B allocator implementation.

Regards.

-- 
Yann Droneaud
OPTEYA

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html