On Tue, Dec 18, 2018 at 02:15:56PM +0200, Leon Romanovsky wrote:
> From: Huy Nguyen <huyn@xxxxxxxxxxxx>
>
> On NVMe offloads connection with many IO queues, EEH takes long time to
> recover. The culprit is the synchronize_srcu in the destroy_mkey. Solution
> is to use synchronize_srcu only for ODP mkey.
>
> Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers")
> Signed-off-by: Huy Nguyen <huyn@xxxxxxxxxxxx>
> Reviewed-by: Daniel Jurgens <danielj@xxxxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> ---
>  drivers/infiniband/hw/mlx5/mr.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)

I'm going to apply this, because it does make sense to reduce the
calls to synchronize_srcu. However, I think this design is poor: it
would be better to use call_srcu to do the cleanup/kfree rather than a
full synchronize, as this problem will return if there are a large
number of user ODP MRs.

So, I think a followup would be good.

Jason
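
For readers following the thread, the trade-off between a blocking
synchronize per object and a deferred callback can be illustrated with
a userspace toy model. This is a sketch under stated assumptions, not
kernel code and not the mlx5 implementation: toy_srcu, toy_head,
toy_mr, toy_synchronize, toy_call and toy_grace_period_end are all
hypothetical names, and a grace period is simulated by incrementing a
counter. toy_synchronize stands in for synchronize_srcu (the caller
waits once per object) and toy_call stands in for call_srcu (the
caller queues a free callback and moves on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback node, loosely playing the role of struct rcu_head. */
struct toy_head {
	struct toy_head *next;
	void (*func)(struct toy_head *head);
};

/* Hypothetical stand-in for an SRCU domain: a counter plus a callback queue. */
struct toy_srcu {
	unsigned long grace_periods;	/* simulated grace periods elapsed */
	struct toy_head *pending;	/* callbacks queued by toy_call() */
};

/* Hypothetical MR object; the real mlx5 structures are not modeled. */
struct toy_mr {
	int key;
	struct toy_head head;
};

/* Models a blocking synchronize: one full grace period per call. */
static void toy_synchronize(struct toy_srcu *s)
{
	s->grace_periods++;
}

/* Models a deferred call: queue the callback and return immediately. */
static void toy_call(struct toy_srcu *s, struct toy_head *h,
		     void (*func)(struct toy_head *head))
{
	h->func = func;
	h->next = s->pending;
	s->pending = h;
}

/* Models one grace period ending: it covers every callback queued so far. */
static void toy_grace_period_end(struct toy_srcu *s)
{
	struct toy_head *h = s->pending;

	s->pending = NULL;
	s->grace_periods++;
	while (h) {
		struct toy_head *next = h->next; /* read before func frees h */

		h->func(h);
		h = next;
	}
}

/* Free callback: recover the enclosing toy_mr from its embedded head. */
static void toy_mr_free(struct toy_head *h)
{
	struct toy_mr *mr =
		(struct toy_mr *)((char *)h - offsetof(struct toy_mr, head));

	free(mr);
}

/* Destroy n objects, waiting once per object; returns grace periods used. */
unsigned long destroy_sync(unsigned int n)
{
	struct toy_srcu s = { 0, NULL };
	unsigned int i;

	for (i = 0; i < n; i++) {
		struct toy_mr *mr = calloc(1, sizeof(*mr));

		assert(mr != NULL);
		toy_synchronize(&s);
		free(mr);
	}
	return s.grace_periods;
}

/* Destroy n objects by queuing their frees; returns grace periods used. */
unsigned long destroy_deferred(unsigned int n)
{
	struct toy_srcu s = { 0, NULL };
	unsigned int i;

	for (i = 0; i < n; i++) {
		struct toy_mr *mr = calloc(1, sizeof(*mr));

		assert(mr != NULL);
		toy_call(&s, &mr->head, toy_mr_free);
	}
	toy_grace_period_end(&s);
	return s.grace_periods;
}
```

In this model, destroy_sync(1000) consumes 1000 simulated grace
periods while destroy_deferred(1000) consumes one, which is the
scaling concern raised above for a large number of ODP MRs. The model
leaves out everything that makes the real conversion non-trivial, such
as draining outstanding callbacks before the owning structures go
away.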