On Mon, Oct 24, 2022 at 8:55 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote: > > From: David Jeffery <djeffery@xxxxxxxxxx> > > David Jeffery found one double ->queue_rq() issue, so far it can > be triggered in VM use case because of long vmexit latency or preempt > latency of vCPU pthread or long page fault in vCPU pthread, then block > IO req could be timed out before queuing the request to hardware but after > calling blk_mq_start_request() during ->queue_rq(), then timeout handler > may handle it by requeue, then double ->queue_rq() is caused, and kernel > panic. > > So far, it is driver's responsibility to cover the race between timeout > and completion, so it seems supposed to be solved in driver in theory, > given driver has enough knowledge. > > But it is really one common problem, lots of driver could have similar > issue, and could be hard to fix all affected drivers, even it isn't easy > for driver to handle the race. So David suggests this patch by draining > in-progress ->queue_rq() for solving this issue. > > Cc: Bart Van Assche <bvanassche@xxxxxxx> > Cc: David Jeffery <djeffery@xxxxxxxxxx> > Cc: Stefan Hajnoczi <stefanha@xxxxxxxxxx> > Cc: Keith Busch <kbusch@xxxxxxxxxx> > Cc: virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > --- Thank you for your help and patience, Ming. Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>