Re: [RFC PATCH 3/3] blk-mq: Remove generation seqeunce

Bart Van Assche <Bart.VanAssche@xxxxxxx> · Thu, 12 Jul 2018 22:24:42 +0000

On Thu, 2018-07-12 at 13:24 -0600, Keith Busch wrote:
> On Thu, Jul 12, 2018 at 06:16:12PM +0000, Bart Van Assche wrote:
> > What prevents that a request finishes and gets reused after the
> > blk_mq_req_expired() call has finished and before kref_get_unless_zero() is
> > called? Is this perhaps a race condition that has not yet been triggered by
> > any existing block layer test? Please note that there is no such race
> > condition in the patch I had posted ("blk-mq: Rework blk-mq timeout handling
> > again" - https://www.spinics.net/lists/linux-block/msg26489.html).
> 
> I don't think there's any such race in the merged implementation
> either. If the request is reallocated, then the kref check may succeed,
> but the blk_mq_req_expired() check would surely fail, allowing the
> request to proceed as normal. The code comments at least say as much.

Hello Keith,

Before commit 12f5b9314545 ("blk-mq: Remove generation seqeunce"), if a
request completion was reported after request timeout processing had
started, completion handling was skipped. The following code in
blk_mq_complete_request() realized that:

	if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
		__blk_mq_complete_request(rq);

Since commit 12f5b9314545, if a completion occurs after request timeout
processing has started, that completion is processed if the request has the
state MQ_RQ_IN_FLIGHT. blk_mq_rq_timed_out() does not modify the request
state unless the block driver timeout handler modifies it, e.g. by calling
blk_mq_end_request() or by calling blk_mq_requeue_request(). The typical
behavior of scsi_times_out() is to queue sending of a SCSI abort and hence
not to change the request state immediately. In other words, if a request
completion occurs during or shortly after a timeout occurred then
blk_mq_complete_request() will call __blk_mq_complete_request() and will
complete the request, although that is not allowed because timeout handling
has already started. Do you agree with this analysis?

Thanks,

Bart.