Re: [PATCH for-next] RDMA/rxe: Fix memory ordering problem for resize cq

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 25, 2021 at 06:12:46PM -0500, Pearson, Robert B wrote:
> 
> On 5/25/2021 6:03 PM, Jason Gunthorpe wrote:
> > On Tue, May 25, 2021 at 03:11:12PM -0500, Bob Pearson wrote:
> > > The rxe driver has recently begun exhibiting failures in the python
> > > tests that are due to stale values read from the completion queue
> > > producer or consumer indices. Unlike the other loads of these shared
> > > indices those in queue_count() were not protected by smp_load_acquire().
> > > 
> > > This patch replaces loads by smp_load_acquire() in queue_count().
> > > The observed errors no longer occur.
> > > 
> > > Reported-by: Zhu Yanjun <zyj2020@xxxxxxxxx>
> > > Fixes: d21a1240f516 ("RDMA/rxe: Use acquire/release for memory ordering")
> > > Signed-off-by: Bob Pearson <rpearsonhpe@xxxxxxxxx>
> > >   drivers/infiniband/sw/rxe/rxe_queue.h | 18 ++++++++++++++++--
> > >   1 file changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
> > > index 2902ca7b288c..5cb142282fa6 100644
> > > +++ b/drivers/infiniband/sw/rxe/rxe_queue.h
> > > @@ -161,8 +161,22 @@ static inline unsigned int index_from_addr(const struct rxe_queue *q,
> > >   static inline unsigned int queue_count(const struct rxe_queue *q)
> > >   {
> > > -	return (q->buf->producer_index - q->buf->consumer_index)
> > > -		& q->index_mask;
> > > +	u32 prod;
> > > +	u32 cons;
> > > +	u32 count;
> > > +
> > > +	/* make sure all changes to queue complete before
> > > +	 * changing producer index
> > > +	 */
> > > +	prod = smp_load_acquire(&q->buf->producer_index);
> > > +
> > > +	/* make sure all changes to queue complete before
> > > +	 * changing consumer index
> > > +	 */
> > > +	cons = smp_load_acquire(&q->buf->consumer_index);
> > > +	count = (prod - cons) % q->index_mask;
> > > +
> > > +	return count;
> > >   }
> > It doesn't quite make sense to load both?
> > 
> > Only the one written by userspace should require a load_acquire, the
> > one written by the kernel should already be locked somehow with a
> > kernel lock or there is a different problem
> > 
> > But yes, this does look like a bug to me
> > 
> > Jason
> 
> The rxe_queue.h API is used by completion and work queues so both directions
> occur. The question is the trade off between the logic to avoid the
> load_acquire vs the branching to get around. Same point could be applied to
> queue_empty() which already had both load_acquire() calls. Is it worth
> creating several versions of APIs with different choices made?

Hard to say.. I would probably have made special types for 'user
producer' and 'user consumer' queues and had actually different
behavior - the kernel controlled data should be shadowed in the kernel
and only copied to the user - have to consider the user corrupting the
data, for instance.

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux