On Mon, 2018-04-02 at 14:10 -0700, Tejun Heo wrote:
> On Mon, Apr 02, 2018 at 02:08:37PM -0700, Bart Van Assche wrote:
> > On 04/02/18 12:01, Tejun Heo wrote:
> > > +	 * As nothing prevents from completion happening while
> > > +	 * ->aborted_gstate is set, this may lead to ignored completions
> > > +	 * and further spurious timeouts.
> > > +	 */
> > > +	if (rq->rq_flags & RQF_MQ_TIMEOUT_RESET)
> > > +		blk_mq_rq_update_aborted_gstate(rq, 0);
> >
> > Hello Tejun,
> >
> > Since this patch fixes one race but introduces another race, is this
> > patch really an improvement?
>
> Oh, that's not a new race.  That's the same non-critical race which
> always existed.  It's just being documented.

Hello Tejun,

I think it can happen that the above code sees that (rq->rq_flags &
RQF_MQ_TIMEOUT_RESET) != 0, that blk_mq_start_request() executes the
following code:

	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
	blk_add_timer(rq);

and that subsequently blk_mq_rq_update_aborted_gstate(rq, 0) is called,
which will cause the next completion to be lost. Is fixing one occurrence
of a race and reintroducing it in another code path really an improvement?

Thanks,

Bart.