On Fri, Mar 17, 2023 at 5:53 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > > On 2023/3/17 18:17, Selvin Xavier wrote: > > On Fri, Mar 17, 2023 at 2:40 PM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote: > >> > >> On 2023/3/15 16:16, Selvin Xavier wrote: > >>> Add resize_cq verb support for user space CQs. Resize operation for > >>> kernel CQs are not supported now. > >>> > >>> Driver should free the current CQ only after user library polls > >>> for all the completions and switch to new CQ. So after the resize_cq > >>> is returned from the driver, user libray polls for existing completions > >> > >> libray -> library > >> > >>> and store it as temporary data. Once library reaps all completions in the > >>> current CQ, it invokes the ibv_cmd_poll_cq to inform the driver about > >>> the resize_cq completion. Adding a check for user CQs in driver's > >>> poll_cq and complete the resize operation for user CQs. > >>> Updating uverbs_cmd_mask with poll_cq to support this. > >>> > >>> User library changes are available in this pull request. > >>> https://github.com/linux-rdma/rdma-core/pull/1315 > >>> > >>> Signed-off-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx> > >>> --- > >>> drivers/infiniband/hw/bnxt_re/ib_verbs.c | 109 +++++++++++++++++++++++++++++++ > >>> drivers/infiniband/hw/bnxt_re/ib_verbs.h | 3 + > >>> drivers/infiniband/hw/bnxt_re/main.c | 2 + > >>> drivers/infiniband/hw/bnxt_re/qplib_fp.c | 44 +++++++++++++ > >>> drivers/infiniband/hw/bnxt_re/qplib_fp.h | 5 ++ > >>> include/uapi/rdma/bnxt_re-abi.h | 4 ++ > >>> 6 files changed, 167 insertions(+) > >>> > >>> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c > >>> index 989edc7..e86afec 100644 > >>> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c > >>> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c > >>> @@ -2912,6 +2912,106 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, > >>> return rc; > >>> } > >>> > >>> +static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq) > >>> +{ > >>> + struct bnxt_re_dev *rdev = cq->rdev; > >>> + > >>> + bnxt_qplib_resize_cq_complete(&rdev->qplib_res, &cq->qplib_cq); > >>> + > >>> + cq->qplib_cq.max_wqe = cq->resize_cqe; > >>> + if (cq->resize_umem) { > >>> + ib_umem_release(cq->umem); > >>> + cq->umem = cq->resize_umem; > >>> + cq->resize_umem = NULL; > >>> + cq->resize_cqe = 0; > >>> + } > >>> +} > >>> + > >>> +int bnxt_re_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata) > >>> +{ > >>> + struct bnxt_qplib_sg_info sg_info = {}; > >>> + struct bnxt_qplib_dpi *orig_dpi = NULL; > >>> + struct bnxt_qplib_dev_attr *dev_attr; > >>> + struct bnxt_re_ucontext *uctx = NULL; > >>> + struct bnxt_re_resize_cq_req req; > >>> + struct bnxt_re_dev *rdev; > >>> + struct bnxt_re_cq *cq; > >>> + int rc, entries; > >>> + > >>> + cq = container_of(ibcq, struct bnxt_re_cq, ib_cq); > >>> + rdev = cq->rdev; > >>> + dev_attr = &rdev->dev_attr; > >>> + if (!ibcq->uobject) { > >>> + ibdev_err(&rdev->ibdev, "Kernel CQ Resize not supported"); > >>> + return -EOPNOTSUPP; > >>> + } > >>> + > >>> + if (cq->resize_umem) { > >>> + ibdev_err(&rdev->ibdev, "Resize CQ %#x failed - Busy", > >>> + cq->qplib_cq.id); > >>> + return -EBUSY; > >>> + } > >> > >> Does above cq->resize_umem checking has any conconcurrent protection > >> again the bnxt_re_resize_cq_complete() called by bnxt_re_poll_cq()? > >> > >> bnxt_re_resize_cq() seems like a control path operation, while > >> bnxt_re_poll_cq() seems like a data path operation, I am not sure > >> there is any conconcurrent protection between them. > >> > > The previous check is to prevent simultaneous resize_cq context from > > the user application. > > > > if you see the library implementation (PR > > https://github.com/linux-rdma/rdma-core/pull/1315), entire operation > > is done in the single resize_cq context from application > > i.e. > > bnxt_re_resize_cq > > -> ibv_cmd_resize_cq > > call driver bnxt_re_resize_cq and return > > -> poll out the current completions and store it in a user lib list > > -> issue an ibv_cmd_poll_cq. > > This will invoke bnxt_re_poll_cq in the kernel driver. We free > > the previous cq resources. > > > > So the synchronization between resize_cq and poll_cq is happening in > > the user lib. We can free the old CQ only after user land sees the > > final Completion on the previous CQ. So we return from the driver's > > bnxt_re_resize_cq and then poll for all completions and then use > > ibv_cmd_poll_cq as a hook to reach the kernel driver and free the old > > CQ's resource. > > > > Summarize, driver's bnxt_re_poll_cq for user space CQs will be called > > only from resize_cq context and no concurrent protection required in > > the driver. > > Is it acceptable to depend on the user space code to ensure the correct > order in the kernel space? > Isn't that may utilized by some malicious user to crash the kernel? > The user space code I mentioned is implemented in the provider/bnxt_re user library just like any other verb interface. Apps shall always go through this library to send request to the provider driver.
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature