On Wed, Oct 26, 2022 at 01:19:57PM +0800, Ming Lei wrote:
> From: David Jeffery <djeffery@xxxxxxxxxx>
>
> David Jeffery found one double ->queue_rq() issue, so far it can
> be triggered in VM use case because of long vmexit latency or preempt
> latency of vCPU pthread or long page fault in vCPU pthread, then block
> IO req could be timed out before queuing the request to hardware but after
> calling blk_mq_start_request() during ->queue_rq(), then timeout handler
> may handle it by requeue, then double ->queue_rq() is caused, and kernel
> panic.
>
> So far, it is driver's responsibility to cover the race between timeout
> and completion, so it seems supposed to be solved in driver in theory,
> given driver has enough knowledge.
>
> But it is really one common problem, lots of driver could have similar
> issue, and could be hard to fix all affected drivers, even it isn't easy
> for driver to handle the race. So David suggests this patch by draining
> in-progress ->queue_rq() for solving this issue.
>
> Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> Cc: Keith Busch <kbusch@xxxxxxxxxx>
> Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Cc: Bart Van Assche <bvanassche@xxxxxxx>
> Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>
> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> ---
> V3:
> - add callback for handle expired only, suggested by Keith Busch

Hi Jens,

Any chance to merge this fix? Either 6.1 or 6.2 is fine for me.

Thanks,
Ming