Re: [PATCH rdma-next v1 0/3] Introduce new advise MR verb

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 19, 2018 at 09:02:51AM +0200, Leon Romanovsky wrote:
> On Tue, Dec 18, 2018 at 03:27:30PM -0700, Jason Gunthorpe wrote:
> > On Tue, Dec 11, 2018 at 01:37:50PM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> > >
> > > Changelog v0->v1:
> > >  * Fixed commit message in patch 2
> > >  * Removed redundant brackets
> > >  * Add FIXME comment
> > >  * Flush workqueue to ensure no work is executed during ib_device dereg
> > >  * Change declaration of sg_list ot be flex array
> > >  * Fix rebase error
> > >
> > > Hi,
> > >
> > > In this series from Moni, we are implementing the new advise_mr()
> > > verb, which was proposed as RFC [1].
> > >
> > > The verb advise_mr() borrows its definition from the system call
> > > madvise() by giving an advice to the driver about an address range
> > > that belongs to a memory region (MR), in opposite to madvise() which
> > > operates on addresses and has different logical semantics not suitable
> > > for MRs.
> > >
> > > This verb is used by applications to tell the kernel about expected
> > > memory usage to efficiently prepare it in advance, prior any following
> > > usage. Like with madvise(), the advise_mr verb does not interfere
> > > the semantics of the application, but can improve application performance.
> > >
> > > Being an advice, the kernel is free to ignore advise_mr() calls.
> > >
> > > Important example of such performance improvement hint is partial
> > > pre-fetching of an ODP MRs.
> > >
> > > Such pre-fetched ODP address ensure that range is exist before the actual
> > > IO is conducted. This would provide a way to reduce latency by overlapping
> > > paging-in and either compute time or IO to other ranges.
> > >
> > > Thanks
> > >
> > > [1] https://www.spinics.net/lists/linux-rdma/msg70592.html
> > >
> > > This series has merge conflict with commit: 4d5422a309de
> > > ("IB/mlx5: Skip non-ODP MR when handling a page fault") in rdma-rc.
> > >
> > > The resolution is as follow:
> > > 1. It is an error to ask "prefetch" for non-ODP MRs, because it came from explicit request.
> > > 2. It is OK to have non-ODP MRs in page-faults.
> > >
> > > +               if (prefetch && !mr->umem->is_odp) {
> > > +                       ret = -EINVAL;
> > > +                       goto srcu_unlock;
> > > +               }
> > > +
> > >  +              if (!mr->umem->is_odp) {
> > >  +                      mlx5_ib_dbg(dev, "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n",
> > >  +                                  key);
> > >  +                      if (bytes_mapped)
> > >  +                              *bytes_mapped += bcnt;
> > >  +                      ret = 0;
> > >  +                      goto srcu_unlock;
> > >  +              }
> > >
> > > Moni Shoua (3):
> > >   IB/uverbs: Add helper to get array size from ptr attribute
> > >   IB/uverbs: Add support to advise_mr
> > >   IB/mlx5: Add advise_mr() support
> >
> > Applied to for next. I had to rebase it over the ops patches, which
> > wasn't hard, but please check my work.
>
> Thanks, It looks good.

Jason,

I continued to test it and found that compilation without CONFIG_INFINIBAND_ON_DEMAND_PAGING
created bunch of warnings about unused function.

This diff, which is far from my original intent fixes it.

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index f245b5d8a3bc..4307d2ebf467 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1207,14 +1207,6 @@ static inline void mlx5_odp_init_mr_cache_entry(struct mlx5_cache_ent *ent) {}
 static inline void mlx5_odp_populate_klm(struct mlx5_klm *pklm, size_t offset,
 					 size_t nentries, struct mlx5_ib_mr *mr,
 					 int flags) {}
-
-static int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd,
-				      enum ib_uverbs_advise_mr_advice advice,
-				      u32 flags, struct ib_sge *sg_list,
-				      u32 num_sge)
-{
-	return -EOPNOTSUPP;
-}
 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */

 /* Needed for rep profile */
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 1bd8c1b1dba1..a62ff8131e0c 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1293,6 +1293,7 @@ static struct ib_mr *mlx5_ib_get_memic_mr(struct ib_pd *pd, u64 memic_addr,
 	return ERR_PTR(err);
 }

+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 int mlx5_ib_advise_mr(struct ib_pd *pd,
 		      enum ib_uverbs_advise_mr_advice advice,
 		      u32 flags,
@@ -1307,6 +1308,7 @@ int mlx5_ib_advise_mr(struct ib_pd *pd,
 	return mlx5_ib_advise_mr_prefetch(pd, advice, flags,
 					 sg_list, num_sge);
 }
+#endif

 struct ib_mr *mlx5_ib_reg_dm_mr(struct ib_pd *pd, struct ib_dm *dm,
 				struct ib_dm_mr_attr *attr,
>
> >
> > Thanks,
> > Jason


Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux