Re: [Bug] double ->queue_rq() because of timeout in ->queue_rq()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



blk_On Fri, Oct 21, 2022 at 10:40 AM Keith Busch <kbusch@xxxxxxxxxx> wrote:
>
> The blk_mq_wait_quiesce_done() will only wait for tasks that entered
> just before calling that function. It will not wait for tasks that
> entered immediately after.

Right. The design was depending on saving a jiffies value before the
blk_mq_wait_quiesce_done() call and deadline behavior to avoid having
to care about queue_rq calls starting after
blk_mq_wait_quiesce_done().

> If I correctly understand the problem you're describing, the hypervisor
> may prevent any guest process from running. If so, the timeout work may
> be stalled after the quiesce, and if a queue_rq() process also stalled
> after starting quiesce_done(), then we're in the same situation you're
> trying to prevent, right?

For handling this condition and avoiding the issue, the code was
depending on the deadline value and rq->state value.
blk_mq_start_request sets a deadline then sets MQ_RQ_IN_FLIGHT, while
blk_mq_req_expired checks for MQ_RQ_IN_FLIGHT then checks the
deadline. A new queue_rq call would then not count as expired from
these checks from either its deadline being updated to be in the
future or from missing MQ_RQ_IN_FLIGHT .

However, there are no memory barriers ensuring this ordering. So it
looks like it should prevent an issue with a new queue_rq on x86 cpus
but could race on cpus with weaker memory ordering.


> I agree with your idea that this is a lower level driver responsibility:
> it should reclaim all started requests before allowing new queuing.
> Perhaps the block layer should also raise a clear warning if it's
> queueing a request that's already started.
>

If I understand what you are saying, that is roughly how the old scsi
timeout handling used to work before the async-abort changes. It
prevented new requests and drained everything (though HBA-wide not
just request_queue-wide) before doing anything with timed out
commands. The current implementation makes avoiding race conditions
more difficult but reduces the performance impact of a lone
misbehaving request.

So a driver under the block layer could potentially avoid the current
races by properly quiescing the request_queue before handling any
timed out requests. But I was trying to come up with a solution which
avoided that requirement.

David Jeffery




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux