Re: [PATCH rdma-core 4/5] verbs: Add alloc_null_mr verb

On Wed, Jun 20, 2018 at 07:28:21PM +0300, Yishai Hadas wrote:
> From: Yonatan Cohen <yonatanc@xxxxxxxxxxxx>
> 
> ibv_alloc_null_mr() allocates a null memory region (MR) that is associated
> with the protection domain PD.
> A null MR does not map any specific address.
> It is used to force local HCA operations to skip the PCI bus access, while
> keeping track of the processed length in the ibv_sge handling.
> Meaning, instead of a PCI write access the HCA leaves the target memory
> untouched, and skips filling that packet section.
> Similar behavior is done upon send, the HCA skips data which is pointed
> by that null MR and saves PCI bus access.
> This functionality saves PCI read/write operations and improve performance.
> The MR's member lkey is used as the lkey field of struct ibv_sge when
> posting buffers with ibv_post_* verbs.
> The ibv_mr member addr will be NULL, length will be SIZE_MAX, and the
> rkey will be zero, as they are irrelevant.
> ibv_dereg_mr() deregisters the MR.
> The use of ibv_rereg_mr() or ibv_bind_mw() with this MR is invalid.
> 
> Signed-off-by: Yonatan Cohen <yonatanc@xxxxxxxxxxxx>
> Reviewed-by: Guy Levi <guyle@xxxxxxxxxxxx>
> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxxxx>
>  libibverbs/driver.h                   |  2 ++
>  libibverbs/dummy_ops.c                |  8 +++++
>  libibverbs/man/CMakeLists.txt         |  1 +
>  libibverbs/man/ibv_alloc_null_mr.3.md | 55 +++++++++++++++++++++++++++++++++++
>  libibverbs/verbs.c                    |  7 ++++-
>  libibverbs/verbs.h                    | 18 ++++++++++++
>  6 files changed, 90 insertions(+), 1 deletion(-)
>  create mode 100644 libibverbs/man/ibv_alloc_null_mr.3.md
> 
> diff --git a/libibverbs/driver.h b/libibverbs/driver.h
> index 43077f7..64c8757 100644
> +++ b/libibverbs/driver.h
> @@ -87,6 +87,7 @@ enum ibv_gid_type {
>  
>  enum ibv_mr_type {
>  	IBV_MR_TYPE_MR,
> +	IBV_MR_TYPE_NULL_MR,
>  };
>  
>  struct verbs_mr {
> @@ -218,6 +219,7 @@ struct verbs_context_ops {
>  	struct ibv_dm *(*alloc_dm)(struct ibv_context *context,
>  				   struct ibv_alloc_dm_attr *attr);
>  	struct ibv_mw *(*alloc_mw)(struct ibv_pd *pd, enum ibv_mw_type type);
> +	struct ibv_mr *(*alloc_null_mr)(struct ibv_pd *pd);
>  	struct ibv_pd *(*alloc_parent_domain)(
>  		struct ibv_context *context,
>  		struct ibv_parent_domain_init_attr *attr);
> diff --git a/libibverbs/dummy_ops.c b/libibverbs/dummy_ops.c
> index 1fd8f84..ddc5efe 100644
> +++ b/libibverbs/dummy_ops.c
> @@ -394,6 +394,12 @@ static struct ibv_mr *reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *dm,
>  	return NULL;
>  }
>  
> +static struct ibv_mr *alloc_null_mr(struct ibv_pd *pd)
> +{
> +	errno = ENOSYS;
> +	return NULL;
> +}

These function definitions are in sorted order.

>  static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length,
>  			     int access)
>  {
> @@ -432,6 +438,7 @@ static int resize_cq(struct ibv_cq *cq, int cqe)
>  const struct verbs_context_ops verbs_dummy_ops = {
>  	alloc_dm,
>  	alloc_mw,
> +	alloc_null_mr,
>  	alloc_parent_domain,
>  	alloc_pd,
>  	alloc_td,
> @@ -607,6 +614,7 @@ void verbs_set_ops(struct verbs_context *vctx,
>  	SET_OP(ctx, req_notify_cq);
>  	SET_PRIV_OP(ctx, rereg_mr);
>  	SET_PRIV_OP(ctx, resize_cq);
> +	SET_OP(vctx, alloc_null_mr);

This list is sorted too.

> +
> +**ibv_alloc_null_mr()** allocates a null memory region (MR) that is associated with the protection
> +domain *pd*.
> +A null mr does not map any specific address.
> +It is used to force local HCA operations to skip the PCI bus access, while keeping track of the
> +processed length in the ibv_sge handling.
> +Meaning, instead of a PCI write access, the HCA leaves the target memory untouched,
> +and skips filling that packet section.
> +Similar behavior is done upon send, the HCA skips data which is pointed by that null MR
> +and saves PCI bus access.
> +This functionality saves PCI read/write operations and improve performance.
> +The local key field lkey is used in struct ibv_sge when posting buffers with
> +ibv_post_* verbs.
> +The ibv_mr member addr will be NULL, length will be SIZE_MAX, and the rkey will be zero, as they are irrelevant.
> +**ibv_dereg_mr()** deregisters the MR.
> +The use of ibv_rereg_mr() or ibv_bind_mw()
> +with this MR is invalid.

The above is a bit hard to read. Suggest:

**ibv_alloc_null_mr()** allocates a null memory region (MR) that is
associated with the protection domain *pd*.

A null MR discards all data written to it, and always returns 0 on
read. It has the maximum length and only the lkey is valid; the MR is
not exposed as an rkey.

A device should implement the null MR in a way that bypasses PCI
transfers, internally discarding or sourcing 0 data. This provides a
way to avoid PCI bus transfers by using a scatter/gather list in
commands if applications do not intend to access the data, or need
data to be 0 filled.

**ibv_dereg_mr()** deregisters the MR.  The use of ibv_rereg_mr() or
ibv_bind_mw() with this MR is invalid.
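
To make the intended semantics concrete, here is a minimal software model of
how a device might walk a scatter/gather list that contains a null MR entry.
It is only a sketch: struct model_sge, MODEL_NULL_LKEY, model_scatter() and
model_gather() are hypothetical names invented for illustration, not part of
libibverbs, and real hardware does this internally without any memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lkey value that marks the null MR in this model only. */
#define MODEL_NULL_LKEY 0xffffffffu

/* Simplified stand-in for struct ibv_sge; addr is a plain pointer here. */
struct model_sge {
	uint8_t *addr;
	uint32_t length;
	uint32_t lkey;
};

/*
 * Receive side: scatter len payload bytes across the list.
 * A null MR entry consumes its share of the payload but the target
 * memory is never touched. Returns the number of bytes consumed.
 */
static size_t model_scatter(const struct model_sge *sgl, size_t nsge,
			    const uint8_t *payload, size_t len)
{
	size_t done = 0;

	assert(sgl != NULL || nsge == 0);
	for (size_t i = 0; i < nsge && done < len; i++) {
		size_t chunk = sgl[i].length;

		if (chunk > len - done)
			chunk = len - done;
		if (sgl[i].lkey != MODEL_NULL_LKEY)
			memcpy(sgl[i].addr, payload + done, chunk);
		done += chunk;
	}
	return done;
}

/*
 * Send side: gather up to cap bytes from the list into out.
 * A null MR entry sources zeros instead of reading memory.
 * Returns the number of bytes produced.
 */
static size_t model_gather(const struct model_sge *sgl, size_t nsge,
			   uint8_t *out, size_t cap)
{
	size_t done = 0;

	assert(sgl != NULL || nsge == 0);
	for (size_t i = 0; i < nsge && done < cap; i++) {
		size_t chunk = sgl[i].length;

		if (chunk > cap - done)
			chunk = cap - done;
		if (sgl[i].lkey == MODEL_NULL_LKEY)
			memset(out + done, 0, chunk);
		else
			memcpy(out + done, sgl[i].addr, chunk);
		done += chunk;
	}
	return done;
}
```

In a real application the lkey of the skipped entry would be taken from the
lkey member of the MR returned by ibv_alloc_null_mr(pd), and the list would be
posted with the ibv_post_* verbs rather than walked in software.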

> diff --git a/libibverbs/verbs.h b/libibverbs/verbs.h
> index 83ff88c..2d04715 100644
> +++ b/libibverbs/verbs.h
> @@ -1795,6 +1795,7 @@ struct verbs_context {
>  	struct ibv_counters *(*create_counters)(struct ibv_context *context,
>  						struct ibv_counters_init_attr *init_attr);
>  	int (*destroy_counters)(struct ibv_counters *counters);
> +	struct ibv_mr *(*alloc_null_mr)(struct ibv_pd *pd);
>  	struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm,
>  				    uint64_t dm_offset, size_t length,
>  				    unsigned int access);

WOAH! What is this? You know better. New stuff is always added at the top of this struct.

> +/*
> + * ibv_alloc_null_mr - allocate mr with special lkey
> + */

'special lkey' is mlx5 specific language, don't use it in the generic header.

Jason