Re: [PATCH libmlx5 3/7] Add inline functions to read completion's attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 01, 2016 at 04:47:58PM +0300, Yishai Hadas wrote:
> Add inline functions in order to read various completion's
> attributes. These functions will be assigned in the ibv_cq_ex
> structure in order to allow the user to read the completion's
> attributes.

Did you look at using for these?

__attribute__((optimize("-fno-omit-frame-pointer")))

To help get rid of the stack frame?

> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
> Signed-off-by: Matan Barak <matanb@xxxxxxxxxxxx>
>  src/cq.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 121 insertions(+)
> 
> diff --git a/src/cq.c b/src/cq.c
> index a056787..188b34e 100644
> +++ b/src/cq.c
> @@ -850,6 +850,127 @@ int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc)
>  	return poll_cq(ibcq, ne, wc, 1);
>  }
>  
> +static inline enum ibv_wc_opcode mlx5_cq_read_wc_opcode(struct ibv_cq_ex *ibcq)
> +{
> +	fprintf(stderr, "un-expected opcode in cqe\n");

Something like this can be very expensive, depending on the version of
gcc it may force a stack frame to be created for all callers. You
should audit the assembly to be sure that gcc is not creating
stack frames.

> +static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex
> *ibcq)

You need to go through everything you've written and make sure it is
const-correct.
You should also annotate with restrict.

Doing this properly can increase performance by allowing the caller to
avoid reloads.

> +{
> +	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));

You might want to look at having the caller pass in 'cq' from a member
in ibv_cq_ex. That way the caller can hold the cq in a register
instead of constantly reloading it like this - I'm assuming to_mcq is
not just an container_of??

> +static inline int mlx5_cq_read_wc_flags(struct ibv_cq_ex *ibcq)
> +{
> +	struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq));
> +	int wc_flags = 0;
> +
> +	if (cq->flags & MLX5_CQ_FLAGS_RX_CSUM_VALID)
> +		wc_flags = (!!(cq->cqe64->hds_ip_ext & MLX5_CQE_L4_OK) &
> +				 !!(cq->cqe64->hds_ip_ext & MLX5_CQE_L3_OK) &
> +				 (get_cqe_l3_hdr_type(cq->cqe64) ==
> +				  MLX5_CQE_L3_HDR_TYPE_IPV4)) <<
> +				IBV_WC_IP_CSUM_OK_SHIFT;
> +
> +	switch (cq->cqe64->op_own >> 4) {
> +	case MLX5_CQE_RESP_WR_IMM:
> +	case MLX5_CQE_RESP_SEND_IMM:
> +		wc_flags	|= IBV_WC_WITH_IMM;
> +		break;
> +	}
> +
> +	wc_flags |= ((ntohl(cq->cqe64->flags_rqpn) >> 28) & 3) ? IBV_WC_GRH : 0;
> +	return wc_flags;
> +}

There is a disappointing amount of branching here...

Anyhow, if there was any doubt that we could use a generic inline
scheme I think this patch puts it too rest, the work being done in
most of the accessors is fairly complex.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux