On Wed, Dec 07, 2022 at 10:57:51AM +0200, Michael Guralnik wrote: > Currently, when dereging an MR, if the mkey doesn't belong to a cache > entry, it will be destroyed. > As a result, the restart of applications with many non-cached mkeys is > not efficient since all the mkeys are destroyed and then recreated. > This process takes a long time (for 100,000 MRs, it is ~20 seconds for > dereg and ~28 seconds for re-reg). > > To shorten the restart runtime, insert all cacheable mkeys to the cache. > If there is no fitting entry to the mkey properties, create a temporary > entry that fits it. > > After a predetermined timeout, the cache entries will shrink to the > initial high limit. > > The mkeys will still be in the cache when consuming them again after an > application restart. Therefore, the registration will be much faster > (for 100,000 MRs, it is ~4 seconds for dereg and ~5 seconds for re-reg). > > The temporary cache entries created to store the non-cache mkeys are not > exposed through sysfs like the default cache entries. > > Signed-off-by: Michael Guralnik <michaelgur@xxxxxxxxxx> > --- > drivers/infiniband/hw/mlx5/mlx5_ib.h | 24 ++++++------ > drivers/infiniband/hw/mlx5/mr.c | 55 +++++++++++++++++++++------- > 2 files changed, 55 insertions(+), 24 deletions(-) > > diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h > index be6d9ec5b127..8f0faa6bc9b5 100644 > --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h > +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h > @@ -617,12 +617,25 @@ enum mlx5_mkey_type { > MLX5_MKEY_INDIRECT_DEVX, > }; > > +struct mlx5r_cache_rb_key { > + u8 ats:1; > + unsigned int access_mode; > + unsigned int access_flags; > + /* > + * keep ndescs as the last member so entries with about the same ndescs > + * will be close in the tree > + */ > + unsigned int ndescs; > +}; > + > struct mlx5_ib_mkey { > u32 key; > enum mlx5_mkey_type type; > unsigned int ndescs; > struct wait_queue_head wait; > refcount_t usecount; > + /* User Mkey must hold either a cache_key or a cache_ent. */ > + struct mlx5r_cache_rb_key rb_key; What is a cache_key? Why do we now have ndecs and rb_key.ndescs in the same struct? > struct mlx5_cache_ent *cache_ent; > }; > > @@ -731,17 +744,6 @@ struct umr_common { > unsigned int state; > }; > > -struct mlx5r_cache_rb_key { > - u8 ats:1; > - unsigned int access_mode; > - unsigned int access_flags; > - /* > - * keep ndescs as the last member so entries with about the same ndescs > - * will be close in the tree > - */ > - unsigned int ndescs; > -}; Don't move this, put it where it needs to be in the earlier patch > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c > index 6531e38ef4ec..2e984d436ad5 100644 > --- a/drivers/infiniband/hw/mlx5/mr.c > +++ b/drivers/infiniband/hw/mlx5/mr.c > @@ -1096,15 +1096,14 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd, > rb_key.access_flags = get_unchangeable_access_flags(dev, access_flags); > ent = mkey_cache_ent_from_rb_key(dev, rb_key); > /* > - * Matches access in alloc_cache_mr(). If the MR can't come from the > - * cache then synchronously create an uncached one. > + * If the MR can't come from the cache then synchronously create an uncached > + * one. > */ > - if (!ent || ent->limit == 0 || > - !mlx5r_umr_can_reconfig(dev, 0, access_flags) || > - mlx5_umem_needs_ats(dev, umem, access_flags)) { > + if (!ent) { > mutex_lock(&dev->slow_path_mutex); > mr = reg_create(pd, umem, iova, access_flags, page_size, false); > mutex_unlock(&dev->slow_path_mutex); > + mr->mmkey.rb_key = rb_key; > return mr; > } Does this belong in this patch? Maybe these cleanups need their own patch Jason