Re: [PATCH v7 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue

On 2020/2/6 4:30, Jason Gunthorpe wrote:
> On Tue, Feb 04, 2020 at 04:47:38PM +0800, Liuyixian (Eason) wrote:
>>
>>
>> On 2020/1/29 4:05, Jason Gunthorpe wrote:
>>> On Wed, Jan 15, 2020 at 05:49:13PM +0800, Yixian Liu wrote:
>>>> diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
>>>> index fa38582..ad7ed07 100644
>>>> +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
>>>> @@ -56,10 +56,16 @@ static void flush_work_handle(struct work_struct *work)
>>>>  	attr_mask = IB_QP_STATE;
>>>>  	attr.qp_state = IB_QPS_ERR;
>>>>  
>>>> -	ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
>>>> -	if (ret)
>>>> -		dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
>>>> -			ret);
>>>> +	while (atomic_read(&hr_qp->flush_cnt)) {
>>>> +		ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
>>>> +		if (ret)
>>>> +			dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
>>>> +				ret);
>>>> +
>>>> +		/* If flush_cnt larger than 1, only need one more time flush */
>>>> +		if (atomic_dec_and_test(&hr_qp->flush_cnt))
>>>> +			atomic_set(&hr_qp->flush_cnt, 1);
>>>> +	}
>>>
>>> And this while loop is just 
>>
>> There is a bug here, the code should be:
>> if (!atomic_dec_and_test(&hr_qp->flush_cnt))
>> 	atomic_set(&hr_qp->flush_cnt, 1);
>>
>> It merges all further flush requests into a single additional flush,
>> that is, the loop runs once more if flush_cnt was larger than 1.
>>
>>>
>>> if (atomic_xchg(&hr_qp->flush_cnt, 0)) {
>>>   [..]
>>> }
>>
>> I think we can't use an if in place of the while loop.
> 
> Well, you can't do two operations and still have an atomic, so you
> have to fix it somehow. Possibly this needs a spinlock approach
> instead.

Agree.
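
To make the point concrete, here is a rough userspace sketch of the lock-based variant. It is not driver code: the demo_* names are hypothetical, and a tiny C11 atomic_flag spin loop stands in for the kernel spinlock. The read and the reset of the pending count happen inside one critical section, so they cannot be interleaved with a new request the way atomic_dec_and_test() followed by atomic_set() can.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a kernel spinlock, built on C11 atomic_flag. */
struct demo_lock {
	atomic_flag held;
};

void demo_lock_acquire(struct demo_lock *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

void demo_lock_release(struct demo_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical flush state: a plain counter protected by the lock. */
struct demo_flush_state {
	struct demo_lock lock;
	int pending;	/* flush requests not yet handled */
};

/* Requester side: record one more flush request. */
void demo_add_request(struct demo_flush_state *s)
{
	demo_lock_acquire(&s->lock);
	s->pending++;
	demo_lock_release(&s->lock);
}

/* Handler side: read and reset the counter in a single critical section. */
int demo_take_pending(struct demo_flush_state *s)
{
	int n;

	demo_lock_acquire(&s->lock);
	n = s->pending;
	s->pending = 0;
	demo_lock_release(&s->lock);
	return n;
}

/* Two requests, then two handler passes; encodes both results as first*10 + second. */
int demo_lock_example(void)
{
	struct demo_flush_state s = { .lock = { .held = ATOMIC_FLAG_INIT }, .pending = 0 };
	int first, second;

	demo_add_request(&s);
	demo_add_request(&s);
	first = demo_take_pending(&s);	/* sees both requests */
	second = demo_take_pending(&s);	/* nothing left */
	return first * 10 + second;
}
```

In the driver, the handler would issue one modify-QP call whenever the value it took is non-zero.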

> 
>> With your solution, when the user posts a new wr while the [...]
>> inside the if block is still executing, it will re-queue a new
>> init_flush_work, which will lead to the multiple-call problem we
>> discussed in v2.
> 
> queue_work can be called while a work is still running, it just makes
> sure it will run again.

Agree.

> 
>>> I'm not even sure this needs to be a counter, all you need is set_bit()
>>> and test_and_clear()
>>
>> We need the value of flush_cnt larger than 1 to record further flush
>> requirements, that's why flush_cnt can't be defined as a flag or bit
>> value.
> 
> This explanation doesn't make sense, the counter isn't being used to
> count anything, it is just a flag.

Yes, you are right. I have reconsidered the solution in light of your
suggestion; a flag is enough for the whole solution. Will fix it in v8
using the flag approach.
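
For reference, a minimal userspace sketch of the flag idea. This is not the v8 code: the demo_* names are hypothetical, and C11 atomic_exchange() stands in for the kernel's set_bit() / test_and_clear_bit() pair. Any number of requests before the handler runs collapse into one flush, and a request arriving after the clear simply sets the flag again for the next run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the QP; only the flush flag matters here. */
struct demo_qp {
	atomic_bool flush_pending;	/* set by requesters, cleared by the handler */
	int flush_runs;			/* how many times the flush body ran */
};

/*
 * Requester side: mark that a flush is needed.
 * Returns true if this call turned the flag on.
 */
bool demo_request_flush(struct demo_qp *qp)
{
	return !atomic_exchange(&qp->flush_pending, true);
}

/* Handler side: test and clear in one atomic step, then flush once. */
void demo_flush_work(struct demo_qp *qp)
{
	if (atomic_exchange(&qp->flush_pending, false))
		qp->flush_runs++;	/* placeholder for the modify-QP-to-error call */
}

/* Three requests before the handler runs result in a single flush. */
int demo_coalesced_runs(void)
{
	struct demo_qp qp = { .flush_runs = 0 };

	atomic_init(&qp.flush_pending, false);
	demo_request_flush(&qp);
	demo_request_flush(&qp);
	demo_request_flush(&qp);
	demo_flush_work(&qp);
	demo_flush_work(&qp);	/* flag already clear: no second flush */
	return qp.flush_runs;
}
```

Because the test and the clear are a single operation, there is no window in which a concurrent request can be lost between them.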

Thanks a lot.

> 
> Jason
> 
> 



