Re: [PATCH v7 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 04, 2020 at 04:47:38PM +0800, Liuyixian (Eason) wrote:
> 
> 
> On 2020/1/29 4:05, Jason Gunthorpe wrote:
> > On Wed, Jan 15, 2020 at 05:49:13PM +0800, Yixian Liu wrote:
> >> diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
> >> index fa38582..ad7ed07 100644
> >> +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
> >> @@ -56,10 +56,16 @@ static void flush_work_handle(struct work_struct *work)
> >>  	attr_mask = IB_QP_STATE;
> >>  	attr.qp_state = IB_QPS_ERR;
> >>  
> >> -	ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
> >> -	if (ret)
> >> -		dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
> >> -			ret);
> >> +	while (atomic_read(&hr_qp->flush_cnt)) {
> >> +		ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
> >> +		if (ret)
> >> +			dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
> >> +				ret);
> >> +
> >> +		/* If flush_cnt larger than 1, only need one more time flush */
> >> +		if (atomic_dec_and_test(&hr_qp->flush_cnt))
> >> +			atomic_set(&hr_qp->flush_cnt, 1);
> >> +	}
> > 
> > And this while loop is just 
> 
> There is a bug here, the code should be:
> if (!atomic_dec_and_test(&hr_qp->flush_cnt))
> 	atomic_set(&hr_qp->flush_cnt, 1);
> 
> It merges all further flush operation requirements into only one more time flush,
> that is, do the loop once again if flush_cnt larger than 1.
> 
> > 
> > if (atomic_xchg(&hr_qp->flush_cnt, 0)) {
> >   [..]
> > }
> 
> I think we can't use if instead of while loop.

Well, you can't do two operations and still have an atomic, so you
have to fix it somehow. Possibly this needs a spinlock approach
instead.

> With your solution, when user posts a new wr during the
> implementation of [...] in if condition, it will re-queue a new
> init_flush_work, which will lead to a multiple call problem as we
> discussed in v2.

queue_work can be called while a work is still running, it just makes
sure it will run again.

> > I'm not even sure this needs to be a counter, all you need is set_bit()
> > and test_and_clear()
> 
> We need the value of flush_cnt large than 1 to record further flush
> requirements, that's why flush_cnt can be defined as a flag or bit
> value.

This explanation doesn't make sense, the counter isn't being used to
count anything, it is just a flag.

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux