On Fri, Oct 21, 2022 at 11:22 AM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> On Fri, Oct 21, 2022 at 08:32:31AM -0600, Keith Busch wrote:
> >
> > I agree with your idea that this is a lower level driver responsibility:
> > it should reclaim all started requests before allowing new queuing.
> > Perhaps the block layer should also raise a clear warning if it's
> > queueing a request that's already started.
>
> The thing is that it is one generic issue, lots of VM drivers could be
> affected, and it may not be easy for drivers to handle the race too.
>

While virtual systems are a common source of the problem, fully
preemptible kernels (with or without the real-time patches) can also
trigger this condition quite easily with a poorly behaved real-time
task. Involuntary preemption means a queue_rq call can be stopped
partway through to let another task run. If a poorly behaved task
preempts a task inside a queue_rq function and holds the CPU for longer
than the request timeout, the condition can occur on real or virtual
hardware. So it's not just VM-related drivers that are affected by the
race.

David Jeffery
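
For illustration, here is a minimal user-space sketch of the timing
described above. It is not kernel code: the sim_* names, the tick-based
clock, and the split of queue_rq into a "start" step and a "submit" step
are all assumptions made for the example, and the timeout recovery is
simplified to an immediate requeue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request states for the simulation (not blk-mq's own). */
enum sim_state { SIM_IDLE, SIM_IN_FLIGHT };

struct sim_request {
    enum sim_state state;
    long deadline;     /* tick at which the timeout handler may fire */
    int submit_count;  /* times the request was handed to the "hardware" */
};

/* First half of a simulated queue_rq: mark started and arm the timeout. */
static void sim_start_request(struct sim_request *rq, long now, long timeout)
{
    rq->state = SIM_IN_FLIGHT;
    rq->deadline = now + timeout;
}

/* Second half of a simulated queue_rq: hand the request to the hardware. */
static void sim_submit(struct sim_request *rq)
{
    rq->submit_count++;
}

/*
 * Simulated timeout handler: if the request is in flight and its deadline
 * has passed, reclaim it so it can be requeued. Returns true if it fired.
 */
static bool sim_check_timeout(struct sim_request *rq, long now)
{
    if (rq->state == SIM_IN_FLIGHT && now >= rq->deadline) {
        rq->state = SIM_IDLE;
        return true;
    }
    return false;
}

/*
 * Run one simulated queue_rq that is preempted for `stall` ticks between
 * starting the request and submitting it. Returns how many times the
 * request reached the hardware.
 */
int sim_queue_rq_with_stall(long timeout, long stall)
{
    struct sim_request rq = { SIM_IDLE, 0, 0 };
    long now = 0;

    sim_start_request(&rq, now, timeout);

    now += stall;  /* another task holds the CPU here */

    if (sim_check_timeout(&rq, now)) {
        /* Recovery requeues the request; this second queue_rq is not stalled. */
        sim_start_request(&rq, now, timeout);
        sim_submit(&rq);
    }

    /* The original, stale queue_rq resumes and submits as well. */
    sim_submit(&rq);

    return rq.submit_count;
}
```

In this model, with a 30-tick timeout, a 5-tick stall submits the
request once, while a 31-tick stall submits it twice, regardless of
whether the "hardware" is real or virtual.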