On Wed, Sep 20, 2023 at 01:36:37PM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 20, 2023 at 12:54:56PM +0300, Leon Romanovsky wrote:
> > From: Shay Drory <shayd@xxxxxxxxxx>
> >
> > Currently, mkeys are managed via xarray. This implementation leads to
> > a degradation in cases many MRs are unregistered in parallel, due to xarray
> > internal implementation, for example: deregistration 1M MRs via 64 threads
> > is taking ~15% more time[1].
> >
> > Hence, implement mkeys management via LIFO queue, which solved the
> > degradation.
> >
> > [1]
> > 2.8us in kernel v5.19 compare to 3.2us in kernel v6.4
> >
> > Signed-off-by: Shay Drory <shayd@xxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> > ---
> >  drivers/infiniband/hw/mlx5/mlx5_ib.h |  19 +-
> >  drivers/infiniband/hw/mlx5/mr.c      | 324 ++++++++++++---------------
> >  drivers/infiniband/hw/mlx5/umr.c     |   4 +-
> >  3 files changed, 167 insertions(+), 180 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> > index 16713baf0d06..261c86fe6433 100644
> > --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> > +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> > @@ -753,10 +753,23 @@ struct umr_common {
> >  	unsigned int state;
> >  };
> >
> > +#define NUM_MKEYS_PER_PAGE (PAGE_SIZE / sizeof(u32))
> > +
> > +struct mlx5_mkeys_page {
> > +	u32 mkeys[NUM_MKEYS_PER_PAGE];
> > +	struct list_head list;
> > +};
>
> Er, isn't the point of this to be PAGE_SIZE big?

The more important part is preallocation of whole struct mlx5_mkeys_page
to hold multiple keys in one shot. The PAGE_SIZE alignment can definitely
help to make it even more efficient, but it is not culprit of this patch.

I will change.

Thanks

>
> Add an
>
> static_assert(sizeof(struct mlx5_mkeys_page) == PAGE_SIZE)
>
> And fix it so it is true..
>
> Jason
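
To make the size concern above concrete, here is a minimal userspace sketch. It is not the actual follow-up patch. It assumes 4 KiB pages and uses a stand-in for the kernel's struct list_head. The name mkeys_page_as_posted and the adjusted NUM_MKEYS_PER_PAGE formula are hypothetical and only illustrate one way the suggested static_assert could be made to hold.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

typedef uint32_t u32;

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Layout as posted: the array alone already fills a page, so the
 * list node pushes the struct past PAGE_SIZE.
 */
struct mkeys_page_as_posted {
	u32 mkeys[PAGE_SIZE / sizeof(u32)];
	struct list_head list;
};

/*
 * Hypothetical adjustment: leave room for the list node so the
 * whole struct is exactly one page.
 */
#define NUM_MKEYS_PER_PAGE \
	((PAGE_SIZE - sizeof(struct list_head)) / sizeof(u32))

struct mlx5_mkeys_page {
	u32 mkeys[NUM_MKEYS_PER_PAGE];
	struct list_head list;
};

/* The posted layout is larger than a page ... */
static_assert(sizeof(struct mkeys_page_as_posted) > PAGE_SIZE,
	      "posted layout unexpectedly fits in one page");

/* ... while the adjusted layout satisfies the requested assertion. */
static_assert(sizeof(struct mlx5_mkeys_page) == PAGE_SIZE,
	      "mlx5_mkeys_page must be exactly one page");
```

With this sizing, each allocation of the struct can be served from a single page, at the cost of a few fewer mkeys per page than in the posted version.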