> -----Original Message-----
> From: Leon Romanovsky <leon@xxxxxxxxxx>
> Sent: Monday, February 4, 2019 2:14 AM
> To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxxxxxx>
> Cc: Leon Romanovsky <leonro@xxxxxxxxxxxx>; RDMA mailing list
> <linux-rdma@xxxxxxxxxxxxxxx>; Moni Shoua <monis@xxxxxxxxxxxx>; Parav Pandit
> <parav@xxxxxxxxxxxx>
> Subject: [PATCH rdma-next 1/2] IB/mlx5: Protect against prefetch of invalid MR
>
> From: Moni Shoua <monis@xxxxxxxxxxxx>
>
> When deferring a prefetch request we need to protect against the MR or PD
> being destroyed while the request is still enqueued.
>
> The first step is to validate that the PD owns the lkey that describes
> the MR and that the MR that the lkey refers to is owned by that PD.
>
> The second step is to dequeue all requests when the MR is destroyed.
>
> Since a PD can't be destroyed while it owns MRs, it is guaranteed that
> when a worker wakes up, the request it refers to is still valid.
>
> While at it, replace the dedicated ordered workqueue with the system
> unbound workqueue to reuse an existing resource and improve
> performance. This also fixes a bug of queueing to the wrong workqueue.
>
> Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support")
> Reported-by: Parav Pandit <parav@xxxxxxxxxxxx>
> Signed-off-by: Moni Shoua <monis@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
> diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> index 8d32d3f91277..977d8f5d1e0f 100644
> --- a/drivers/infiniband/hw/mlx5/odp.c
> +++ b/drivers/infiniband/hw/mlx5/odp.c
> @@ -535,6 +535,7 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd,
>  	imr->umem = umem;
>  	init_waitqueue_head(&imr->q_leaf_free);
>  	atomic_set(&imr->num_leaf_free, 0);
> +	atomic_set(&imr->num_pending_prefetch, 0);
>
>  	return imr;
>  }
> @@ -575,6 +576,7 @@ void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr)
>
>  #define MLX5_PF_FLAGS_PREFETCH BIT(0)
>  #define MLX5_PF_FLAGS_DOWNGRADE BIT(1)
> +#define MLX5_PF_FLAGS_PREFETCH_DEFERRED BIT(2)
>  static int pagefault_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
>  			u64 io_virt, size_t bcnt, u32 *bytes_mapped,
>  			u32 flags)
> @@ -741,7 +743,10 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, u32 key,
>  			 u32 *bytes_mapped, u32 flags)
>  {
>  	int npages = 0, srcu_key, ret, i, outlen, cur_outlen = 0, depth = 0;
> -	bool prefetch = flags & MLX5_PF_FLAGS_PREFETCH;
> +	bool prefetch = (flags & MLX5_PF_FLAGS_PREFETCH) ||
> +			(flags & MLX5_PF_FLAGS_PREFETCH_DEFERRED);
> +	bool deferred = flags & MLX5_PF_FLAGS_PREFETCH_DEFERRED;
> +
>  	struct pf_frame *head = NULL, *frame;
>  	struct mlx5_core_mkey *mmkey;
>  	struct mlx5_ib_mr *mr;
> @@ -772,6 +777,10 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, u32 key,
>  	switch (mmkey->type) {
>  	case MLX5_MKEY_MR:
>  		mr = container_of(mmkey, struct mlx5_ib_mr, mmkey);
> +
> +		if (deferred)
> +			atomic_dec(&mr->num_pending_prefetch);
> +

The work item handler may still be running on the MR at this point, so
flush_workqueue() would skip flushing it. The atomic_dec() should be done
after pagefault_mr() completes, i.e. once we are done with the MR. A rough
sketch of the ordering I have in mind is at the bottom of this mail.

>  		if (!mr->live || !mr->ibmr.pd) {
>  			mlx5_ib_dbg(dev, "got dead MR\n");
>  			ret = -EFAULT;
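
To illustrate what I mean: this is only my sketch of the ordering, not the
actual patch. handle_deferred_prefetch() is a made-up helper name, and I am
assuming num_pending_prefetch is what the destroy path waits on before
freeing the MR.

static int handle_deferred_prefetch(struct mlx5_ib_dev *dev,
				    struct mlx5_ib_mr *mr,
				    u64 io_virt, size_t bcnt)
{
	int ret;

	/*
	 * num_pending_prefetch was incremented when the request was
	 * enqueued, so the MR is kept alive while the worker runs.
	 */
	ret = pagefault_mr(dev, mr, io_virt, bcnt, NULL,
			   MLX5_PF_FLAGS_PREFETCH);

	/*
	 * Drop the pending count only after we are completely done
	 * with the MR; only now may a pending dereg proceed.
	 */
	atomic_dec(&mr->num_pending_prefetch);

	return ret;
}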