Re: [PATCH rdma-rc 2/4] IB/mlx5: Retry cache population when resource is temporarily unavailable

On Mon, Nov 26, 2018 at 08:49:00PM +0000, Jason Gunthorpe wrote:
> On Sun, Nov 25, 2018 at 08:34:24PM +0200, Leon Romanovsky wrote:
> > diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> > index 9b195d65a13e..f600623ce3f7 100644
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -480,7 +480,8 @@ struct mlx5_ib_mr *mlx5_mr_cache_alloc(struct mlx5_ib_dev *dev, int entry)
> >  			if (err && err != -EAGAIN)
> >  				return ERR_PTR(err);
> >
> > -			wait_for_completion(&ent->compl);
> > +			wait_for_completion_timeout(&ent->compl,
> > +						    msecs_to_jiffies(20));
>
> I think we should revisit this when the threading here is fixed, as I
> remarked in the prior ODP patches.
>
> Adding a delay like this seems like it will result in overfilling the
> cache in some situations, and is incredibly ugly.


Jason,

wait_for_completion_timeout() is not a delay. It bounds the wait when
the completion does not arrive, instead of waiting for
MAX_SCHEDULE_TIMEOUT, which is LONG_MAX jiffies. In some rare
situations, that unbounded wait leaves the system completely halted.

Please see kernel/sched/completion.c:

void __sched wait_for_completion(struct completion *x)
{
        wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
}
EXPORT_SYMBOL(wait_for_completion);

....

unsigned long __sched
wait_for_completion_timeout(struct completion *x, unsigned long timeout)
{
        return wait_for_common(x, timeout, TASK_UNINTERRUPTIBLE);
}
EXPORT_SYMBOL(wait_for_completion_timeout);

Thanks

>
> Jason


