On Mon, Nov 26, 2018 at 08:49:00PM +0000, Jason Gunthorpe wrote:
> On Sun, Nov 25, 2018 at 08:34:24PM +0200, Leon Romanovsky wrote:
> > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> > index 9b195d65a13e..f600623ce3f7 100644
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -480,7 +480,8 @@ struct mlx5_ib_mr *mlx5_mr_cache_alloc(struct mlx5_ib_dev *dev, int entry)
> >  		if (err && err != -EAGAIN)
> >  			return ERR_PTR(err);
> >
> > -		wait_for_completion(&ent->compl);
> > +		wait_for_completion_timeout(&ent->compl,
> > +					    msecs_to_jiffies(20));
>
> I think we should revisit this when the threading here is fixed, as I
> remarked in the prior ODP patches.
>
> Adding a delay like this seems like it will result in overfilling the
> cache in some situations, and is incredibly ugly.

Jason,

wait_for_completion_timeout() is not a delay. It is a way to bound the
wait, when the completion has not arrived, to something shorter than
MAX_SCHEDULE_TIMEOUT, which is LONG_MAX. In some rare situations, the
unbounded wait leads to a completely halted system.

Please see kernel/sched/completion.c:

	void __sched wait_for_completion(struct completion *x)
	{
		wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
	}
	EXPORT_SYMBOL(wait_for_completion);
	....
	unsigned long __sched
	wait_for_completion_timeout(struct completion *x, unsigned long timeout)
	{
		return wait_for_common(x, timeout, TASK_UNINTERRUPTIBLE);
	}
	EXPORT_SYMBOL(wait_for_completion_timeout);

Thanks

>
> Jason