RE: [PATCH rdma-next v1 5/6] RDMA/mlx5: Change check for cacheable user mkeys

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/01/2024 19:52, Jason Gunthorpe wrote:
On Sun, Jan 28, 2024 at 11:29:15AM +0200, Leon Romanovsky wrote:
From: Or Har-Toov <ohartoov@xxxxxxxxxx>

In the dereg flow, UMEM is not a good enough indication whether an MR
is from userspace since in mlx5_ib_rereg_user_mr there are some cases
when a new MR is created and the UMEM of the old MR is set to NULL.
Why is this a problem though? The only thing the umem has to do is to
trigger the UMR optimization. If UMR is not triggered then the mkey is
destroyed and it shouldn't be part of the cache at all.

The problem is that it doesn't trigger the UMR on mkeys that are dereged
from the rereg flow.
Optimally, we'd want them to return to the cache, if possible.

We can keep relying on the UMEM to decide whether we want to try to return
them to cache, as you suggested in the revoke_mr() below, but that way those
mkeys will not return to the cache and we have to deal with the in_use in
the revoke flow.

@@ -1914,7 +1913,9 @@ int mlx5_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
ib_umem_stop_invalidation_notifier(mr->umem);
}
- if (!mr->mmkey.cache_ent) {
+ /* Stop DMA */
+ if (!is_cacheable_mkey(&mr->mmkey) || mlx5r_umr_revoke_mr(mr) ||
+ cache_ent_find_and_store(dev, mr)) {
And now the mlx5r_umr_can_load_pas() can been lost, that isn't good. A
non-umr-able object should never be placed in the cache. If the mkey's
size is too big it has to be freed normally.

mlx5r_can_load_pas() will not get lost since mkeys that are not-umr-able will not have rb_key or cache_ent set so is_cacheable_mkey is always false for them.

rc = destroy_mkey(to_mdev(mr->ibmr.device), mr);
if (rc)
return rc;
I'm not sure it is right to re-order this? The revokation of a mkey
should be a single operation, which ever path we choose to take..

Regardless the upstream code doesn't have this ordering so it should
all be one sequence of revoking the mkey and synchronizing the cache.

I suggest to put the revoke sequence into one function:

static int mlx5_revoke_mr(struct mlx5_ib_mr *mr)
{
struct mlx5_ib_dev *dev = to_mdev(mr->ibmr.device);

if (mr->umem && mlx5r_umr_can_load_pas(dev, mr->umem->length)) {
if (mlx5r_umr_revoke_mr(mr))
goto destroy;

if (cache_ent_find_and_store(dev, mr))
goto destroy;
return 0;
}

destroy:
if (mr->mmkey.cache_ent) {
spin_lock_irq(&mr->mmkey.cache_ent->mkeys_queue.lock);
mr->mmkey.cache_ent->in_use--;
mr->mmkey.cache_ent = NULL;
spin_unlock_irq(&mr->mmkey.cache_ent->mkeys_queue.lock);
}
return destroy_mkey(dev, mr);
}

(notice we probably shouldn't set cache_ent to null without adjusting in_use)
Jason




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux