Re: [PATCH v1 for-rc] RDMA/vmw_pvrdma: Report CQ missed events

On Thu, Aug 10, 2017 at 08:55:42PM +0000, Adit Ranadive wrote:
> On 8/10/17, 1:02 PM, "Yuval Shaia" <yuval.shaia@xxxxxxxxxx> wrote:
> > On Thu, Aug 10, 2017 at 12:05:02PM -0700, Adit Ranadive wrote:
> > > From: Bryan Tan <bryantan@xxxxxxxxxx>
> > > 
> > > There is a chance of a race between arming the CQ and receiving
> > > completions. By reporting CQ missed events any ULPs should poll
> > > again to get the completions.
> > > 
> > > Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
> > > Acked-by: Aditya Sarwade <asarwade@xxxxxxxxxx>
> > > Signed-off-by: Bryan Tan <bryantan@xxxxxxxxxx>
> > > Signed-off-by: Adit Ranadive <aditr@xxxxxxxxxx>
> > > ---
> > > v0 -> v1:
> > >  - Check for invalid ring index.
> > > ---
> > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 17 ++++++++++++++++-
> > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > index 69bda61..90aa326 100644
> > > --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > @@ -65,13 +65,28 @@ int pvrdma_req_notify_cq(struct ib_cq *ibcq,
> > >  	struct pvrdma_dev *dev = to_vdev(ibcq->device);
> > >  	struct pvrdma_cq *cq = to_vcq(ibcq);
> > >  	u32 val = cq->cq_handle;
> > > +	unsigned long flags;
> > > +	int has_data = 0;
> > >  
> > >  	val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
> > >  		PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
> > >  
> > > +	spin_lock_irqsave(&cq->cq_lock, flags);
> > > +
> > >  	pvrdma_write_uar_cq(dev, val);
> > >  
> > > -	return 0;
> > > +	if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
> > > +		unsigned int head;
> > > +
> > > +		has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
> > > +						    cq->ibcq.cqe, &head);
> > > +		if (unlikely(has_data == PVRDMA_INVALID_IDX))
> > > +			dev_err(&dev->pdev->dev, "CQ ring state invalid\n");
> > 
> > I see the point of checking the return value but, per my understanding
> > (correct me if I'm wrong), this rare case points to a corrupted ring, which
> > can happen *only* because of a bug, so it is not an "error" by nature.
> > If this is correct then I don't see the point of asking this "question" on
> > every call to ib_notify_cq.
> > 
> > Would you agree to move this check into pvrdma_idx_ring_has_data, and even
> > make the function use BUG_ON?
> 
> I'll concede that it points to a corrupted ring (through a device bug or
> memory corruption), but we want to report it as a device error to maintain
> consistency in our driver and give ULPs a chance to clean up. Also, the
> compiler optimization (the unlikely() hint) should help here.

Great, I understand that.
So, can you at least consider moving this dev_err into
pvrdma_idx_ring_has_data, so callers do not need to handle the error themselves?

BTW, the same applies to pvrdma_idx_ring_has_space.
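
To make the suggestion concrete, here is a rough userspace sketch of what I
mean, not the driver code. The names ring_model, ring_has_data,
notify_missed_events, ring_errors and INVALID_IDX are made up for
illustration, fprintf stands in for dev_err, and the locking and atomics of
the real ring are left out. I am also assuming the ring size is a power of
two and that indices run over [0, 2 * max_elems).

```c
#include <stdint.h>
#include <stdio.h>

#define INVALID_IDX (-1)	/* hypothetical stand-in for PVRDMA_INVALID_IDX */

/* Simplified, hypothetical model of one direction of the shared ring. */
struct ring_model {
	uint32_t prod_tail;	/* advanced by the producer */
	uint32_t cons_head;	/* advanced by the consumer */
};

/* Number of corrupted-ring reports, kept only so the sketch can be checked. */
static unsigned int ring_errors;

/* Assumption: max_elems is a power of two, valid indices are < 2 * max_elems. */
static int idx_valid(uint32_t idx, uint32_t max_elems)
{
	return (idx & ~((max_elems << 1) - 1)) == 0;
}

/*
 * Returns 1 if the ring has data, 0 if it is empty, INVALID_IDX if either
 * index is out of range. The error is logged here, once, so that callers
 * do not have to repeat the check and the message.
 */
static int32_t ring_has_data(const struct ring_model *r, uint32_t max_elems,
			     uint32_t *out_head)
{
	const uint32_t tail = r->prod_tail;
	const uint32_t head = r->cons_head;

	if (idx_valid(tail, max_elems) && idx_valid(head, max_elems)) {
		*out_head = head & (max_elems - 1);
		return tail != head;
	}

	ring_errors++;
	fprintf(stderr, "ring state invalid (head=%u tail=%u)\n",
		(unsigned int)head, (unsigned int)tail);
	return INVALID_IDX;
}

/* Caller in the style of the notify path: it just passes the result on. */
static int notify_missed_events(const struct ring_model *r, uint32_t max_elems)
{
	uint32_t head;

	return ring_has_data(r, max_elems, &head);
}
```

With this shape the caller still returns a negative value for a corrupted
ring, so ULPs keep their chance to clean up, but the message lives in one
place.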
