On Fri, Jun 15, 2018 at 11:26 AM, jianchao.wang <jianchao.w.wang@xxxxxxxxxx> wrote: > Hi Ming > > On 06/15/2018 11:20 AM, Ming Lei wrote: >> On Fri, Jun 15, 2018 at 11:04 AM, jianchao.wang >> <jianchao.w.wang@xxxxxxxxxx> wrote: >>> Hi Ming >>> >>> Thanks for your kindly response. >>> >>> On 06/15/2018 10:56 AM, Ming Lei wrote: >>>>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue. >>>>>> It cannot replace the blk_mark_rq_complete which could avoid the race between >>>>>> timeout and io completion path. >>>>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed. >>>>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out >>>>> request is still in abort or eh process. What if a completion irq come during that ? >>>> For blk-mq, it is avoided by the atomic state change in >>>> __blk_mq_complete_request(), >>>> that is why I mentioned the question in my last reply. >>>> >>> >>> but blk_mq_check_expired doesn't do that. >>> do I miss anything ? >> >> Right, that is the difference between blk-mq and legacy now, > > Sorry, I cannot follow your point. > blk_mq_check_expired doesn't do a atmoc state change from IN-FLIGHT to COMPLETE. > __blk_mq_complete_request could still proceed to complete a timed out request > which is in scsi abort or eh process. Is it really OK ? That is the idea of Christoph's patchset of 'complete requests from ->timeout', then drivers need to cover race between timeout and normal completeion. But at least the request won't be completed twice because of the atomic state change in __blk_mq_complete_request(). So what is your real concern about blk-mq's timeout? Thanks, Ming