Re: [PATCH for-next] RDMA/vmw_pvrdma: Use resource ids from physical device if available

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/20/19 5:25 AM, Yuval Shaia wrote:
> On Thu, May 16, 2019 at 06:24:48PM +0000, Adit Ranadive wrote:
>> From: Bryan Tan <bryantan@xxxxxxxxxx>
>>
>> This change allows the RDMA stack to use physical resource numbers if
>> they are passed up from the device. Doing so allows communication with
>> physical non-ESX endpoints (such as a bare-metal Linux machine or a
>> SR-IOV-enabled VM).
> 
> With the price of not supporting migration, right :)

I'm actually going to remove that in v1 since it's a bit misleading :).

> 
>>
>> This is accomplished by separating the concept of the QP number from
>> the QP handle. Previously, the two were the same, as the QP number was
>> exposed to the guest and also used to reference a virtual QP in the
>> device backend. With physical resource numbers exposed, the QP number
>> given to the guest is the QP number assigned to the physical HCA's QP,
> 
> s/assigned to the/assigned by the
> 
>> while the QP handle is still the internal handle used to reference a
>> virtual QP. Regardless of whether the device is exposing physical ids,
>> the driver will still try to pick up the QP handle from the backend if
>> possible. The MR keys exposed to the guest will also be the MR keys
>> created by the physical HCA, instead of virtual MR keys.
> 
> I don't see any thing related to MR in the patch so wondering if you are
> talking about some future patch in your pipe.
> In any case - suggesting to remove this comment.

We already make the distinction for MR in terms of the handle and the keys
so we don't need to do anything special for them.

> 
>>
>> A new version of the create QP response has been added to the device
>> API. The device backend will pass the QP handle up to the driver, if
>> both the device and driver are at the appriopriate version, and the ABI
>> has also been updated to allow the return of the QP handle to the guest
>> library. The PVRDMA version and ABI version have been bumped up because
>> of these non-compatible changes.
> 
> I have to admit i'm not fully understand why this patch is needed, maybe i
> have to read the relevant library patch to have a complete picture but what
> i'm missing here is why it is needed, after all the QEMU device (which is
> also derived by this driver) pass the physical QPN to guest from day one
> and has no issues communicating with any peer, virtual or bare-metal.
> 
> As i see it, from a design perspectives, the driver, as well as the
> library, should be agnostic to how the device allocates its QPN.
> 
> I might be missing something here so will be happy if you can elaborate a
> bit more on that.
> 

Well we're moving from using hypervisor-generated resources ids to hardware-
generated ones so we need to make the distinction between the handle
(hypervisor-generated) and the QPN (hardware-generated) which is currently 
missing from our driver/library. We probably don't need the ABI bump like Jason
mentioned but we do need to make the distinction to allow applications to 
modify QP/post WQEs correctly.

> Yuval
> 
>>
>> Reviewed-by: Jorgen Hansen <jhansen@xxxxxxxxxx>
>> Signed-off-by: Adit Ranadive <aditr@xxxxxxxxxx>
>> Signed-off-by: Bryan Tan <bryantan@xxxxxxxxxx>
>> ---
>>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h | 15 +++++++++++++-
>>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c    |  8 +++++++-
>>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c      | 24 +++++++++++++++++++++--
>>  include/uapi/rdma/vmw_pvrdma-abi.h                |  9 ++++++++-
>>  4 files changed, 51 insertions(+), 5 deletions(-)
>> ---
>>
>> The PR for userspace was sent:
>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Flinux-rdma%2Frdma-core%2Fpull%2F531&amp;data=02%7C01%7Caditr%40vmware.com%7C4b956518947746ee5a1108d6dd1e6579%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636939520083195677&amp;sdata=J28i69JunGocJGuGefqBlCHC1u7hb%2Fqjq9%2FXCPPvNRM%3D&amp;reserved=0
>>
>> ---
>> diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
>> index 8f9749d54688..86a6c054ea26 100644
>> --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
>> +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
>> @@ -58,7 +58,8 @@
>>  #define PVRDMA_ROCEV1_VERSION		17
>>  #define PVRDMA_ROCEV2_VERSION		18
>>  #define PVRDMA_PPN64_VERSION		19
>> -#define PVRDMA_VERSION			PVRDMA_PPN64_VERSION
>> +#define PVRDMA_QPHANDLE_VERSION		20
>> +#define PVRDMA_VERSION			PVRDMA_QPHANDLE_VERSION
>>  
>>  #define PVRDMA_BOARD_ID			1
>>  #define PVRDMA_REV_ID			1
>> @@ -581,6 +582,17 @@ struct pvrdma_cmd_create_qp_resp {
>>  	u32 max_inline_data;
>>  };
>>  
>> +struct pvrdma_cmd_create_qp_resp_v2 {
>> +	struct pvrdma_cmd_resp_hdr hdr;
>> +	u32 qpn;
>> +	u32 qp_handle;
>> +	u32 max_send_wr;
>> +	u32 max_recv_wr;
>> +	u32 max_send_sge;
>> +	u32 max_recv_sge;
>> +	u32 max_inline_data;
>> +};
>> +
>>  struct pvrdma_cmd_modify_qp {
>>  	struct pvrdma_cmd_hdr hdr;
>>  	u32 qp_handle;
>> @@ -663,6 +675,7 @@ struct pvrdma_cmd_destroy_bind {
>>  	struct pvrdma_cmd_create_cq_resp create_cq_resp;
>>  	struct pvrdma_cmd_resize_cq_resp resize_cq_resp;
>>  	struct pvrdma_cmd_create_qp_resp create_qp_resp;
>> +	struct pvrdma_cmd_create_qp_resp_v2 create_qp_resp_v2;
>>  	struct pvrdma_cmd_query_qp_resp query_qp_resp;
>>  	struct pvrdma_cmd_destroy_qp_resp destroy_qp_resp;
>>  	struct pvrdma_cmd_create_srq_resp create_srq_resp;
>> diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
>> index 40182297f87f..02e337837a2e 100644
>> --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
>> +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
>> @@ -201,7 +201,13 @@ static int pvrdma_register_device(struct pvrdma_dev *dev)
>>  	dev->ib_dev.owner = THIS_MODULE;
>>  	dev->ib_dev.num_comp_vectors = 1;
>>  	dev->ib_dev.dev.parent = &dev->pdev->dev;
>> -	dev->ib_dev.uverbs_abi_ver = PVRDMA_UVERBS_ABI_VERSION;
>> +
>> +	if (dev->dsr_version >= PVRDMA_QPHANDLE_VERSION)
>> +		dev->ib_dev.uverbs_abi_ver = PVRDMA_UVERBS_ABI_VERSION;
>> +	else
>> +		dev->ib_dev.uverbs_abi_ver =
>> +				PVRDMA_UVERBS_NO_QP_HANDLE_ABI_VERSION;
>> +
>>  	dev->ib_dev.uverbs_cmd_mask =
>>  		(1ull << IB_USER_VERBS_CMD_GET_CONTEXT)		|
>>  		(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE)	|
>> diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
>> index 0eaaead5baec..8cba7623f379 100644
>> --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
>> +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
>> @@ -195,7 +195,9 @@ struct ib_qp *pvrdma_create_qp(struct ib_pd *pd,
>>  	union pvrdma_cmd_resp rsp;
>>  	struct pvrdma_cmd_create_qp *cmd = &req.create_qp;
>>  	struct pvrdma_cmd_create_qp_resp *resp = &rsp.create_qp_resp;
>> +	struct pvrdma_cmd_create_qp_resp_v2 *resp_v2 = &rsp.create_qp_resp_v2;
>>  	struct pvrdma_create_qp ucmd;
>> +	struct pvrdma_create_qp_resp qp_resp = {};
>>  	unsigned long flags;
>>  	int ret;
>>  	bool is_srq = !!init_attr->srq;
>> @@ -379,13 +381,31 @@ struct ib_qp *pvrdma_create_qp(struct ib_pd *pd,
>>  	}
>>  
>>  	/* max_send_wr/_recv_wr/_send_sge/_recv_sge/_inline_data */
>> -	qp->qp_handle = resp->qpn;
>>  	qp->port = init_attr->port_num;
>> -	qp->ibqp.qp_num = resp->qpn;
>> +	if (dev->dsr_version >= PVRDMA_QPHANDLE_VERSION) {
>> +		qp->ibqp.qp_num = resp_v2->qpn;
>> +		qp->qp_handle = resp_v2->qp_handle;
>> +	} else {
>> +		qp->ibqp.qp_num = resp->qpn;
>> +		qp->qp_handle = resp->qpn;
>> +	}
>> +
>>  	spin_lock_irqsave(&dev->qp_tbl_lock, flags);
>>  	dev->qp_tbl[qp->qp_handle % dev->dsr->caps.max_qp] = qp;
>>  	spin_unlock_irqrestore(&dev->qp_tbl_lock, flags);
>>  
>> +	if (!qp->is_kernel) {
>> +		/* Copy udata back. */
>> +		qp_resp.qpn = qp->ibqp.qp_num;
>> +		qp_resp.qp_handle = qp->qp_handle;
>> +		if (ib_copy_to_udata(udata, &qp_resp, sizeof(qp_resp))) {
>> +			dev_warn(&dev->pdev->dev,
>> +				 "failed to copy back udata\n");
>> +			pvrdma_destroy_qp(&qp->ibqp, udata);
>> +			return ERR_PTR(-EINVAL);
>> +		}
>> +	}
>> +
>>  	return &qp->ibqp;
>>  
>>  err_pdir:
>> diff --git a/include/uapi/rdma/vmw_pvrdma-abi.h b/include/uapi/rdma/vmw_pvrdma-abi.h
>> index 6e73f0274e41..8ebab11dadcb 100644
>> --- a/include/uapi/rdma/vmw_pvrdma-abi.h
>> +++ b/include/uapi/rdma/vmw_pvrdma-abi.h
>> @@ -49,7 +49,9 @@
>>  
>>  #include <linux/types.h>
>>  
>> -#define PVRDMA_UVERBS_ABI_VERSION	3		/* ABI Version. */
>> +#define PVRDMA_UVERBS_NO_QP_HANDLE_ABI_VERSION	3
>> +#define PVRDMA_UVERBS_ABI_VERSION		4	/* ABI Version. */
>> +
>>  #define PVRDMA_UAR_HANDLE_MASK		0x00FFFFFF	/* Bottom 24 bits. */
>>  #define PVRDMA_UAR_QP_OFFSET		0		/* QP doorbell. */
>>  #define PVRDMA_UAR_QP_SEND		(1 << 30)	/* Send bit. */
>> @@ -179,6 +181,11 @@ struct pvrdma_create_qp {
>>  	__aligned_u64 qp_addr;
>>  };
>>  
>> +struct pvrdma_create_qp_resp {
>> +	__u32 qpn;
>> +	__u32 qp_handle;
>> +};
>> +
>>  /* PVRDMA masked atomic compare and swap */
>>  struct pvrdma_ex_cmp_swap {
>>  	__aligned_u64 swap_val;
>> -- 
>> 1.8.3.1
>>





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux