Re: [PATCH v2] blk-mq: Fix race between resetting the timer and completion handling

Bart Van Assche <Bart.VanAssche@xxxxxxx> · Wed, 14 Feb 2018 16:58:56 +0000

On Tue, 2018-02-13 at 13:20 -0800, tj@xxxxxxxxxx wrote:
> On Thu, Feb 08, 2018 at 04:31:43PM +0000, Bart Van Assche wrote:
> > The crash is reported at address scsi_times_out+0x17 == scsi_times_out+23. The
> > instruction at that address tries to dereference scsi_cmnd.device (%rax). The
> > register dump shows that that pointer has the value NULL. The only function I
> > know of that clears the scsi_cmnd.device pointer is scsi_req_init(). The only
> > caller of that function in the SCSI core is scsi_initialize_rq(). That function
> > has two callers, namely scsi_init_command() and blk_get_request(). However,
> > the scsi_cmnd.device pointer is not cleared when a request finishes. This is
> > why I think that the above crash report indicates that scsi_times_out() was
> > called for a request that was being reinitialized and not by device hotplugging.
> 
> Can you please give the following patch a shot?  While timeout path is
> synchornizing against the completion path (and the following re-init)
> while taking back control of a timed-out request, it wasn't doing that
> while giving it back, so the timer registration could race against
> completion and re-issue.  I'm still not quite sure how that can lead
> to the oops tho.  Anyways, we need something like this one way or the
> other.
> 
> This isn't the final patch.  We should add batching-up of rcu
> synchronize calls similar to the abort path.
> 
> Thanks.
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index df93102..b66aec3 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -816,7 +816,8 @@ struct blk_mq_timeout_data {
>  	unsigned int nr_expired;
>  };
>  
> -static void blk_mq_rq_timed_out(struct request *req, bool reserved)
> +static void blk_mq_rq_timed_out(struct blk_mq_hw_ctx *hctx, struct request *req,
> +				bool reserved)
>  {
>  	const struct blk_mq_ops *ops = req->q->mq_ops;
>  	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
> @@ -836,8 +837,12 @@ static void blk_mq_rq_timed_out(struct request *req, bool reserved)
>  		 * ->aborted_gstate is set, this may lead to ignored
>  		 * completions and further spurious timeouts.
>  		 */
> -		blk_mq_rq_update_aborted_gstate(req, 0);
>  		blk_add_timer(req);
> +		if (!(hctx->flags & BLK_MQ_F_BLOCKING))
> +			synchronize_rcu();
> +		else
> +			synchronize_srcu(hctx->srcu);
> +		blk_mq_rq_update_aborted_gstate(req, 0);
>  		break;
>  	case BLK_EH_NOT_HANDLED:
>  		break;
> @@ -893,7 +898,7 @@ static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
>  	 */
>  	if (!(rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) &&
>  	    READ_ONCE(rq->gstate) == rq->aborted_gstate)
> -		blk_mq_rq_timed_out(rq, reserved);
> +		blk_mq_rq_timed_out(hctx, rq, reserved);
>  }
>  
>  static void blk_mq_timeout_work(struct work_struct *work)

Hello Tejun,

With this patch applied the tests I ran so far pass.

Thanks,

Bart.