If hardware queues are stopped for some event, like the device has been
suspended by power management, requests allocated on that hardware queue
are stuck indefinitely, causing a queue freeze to wait forever.
I have a problem with this patch. IMO, this is a general issue, so
why do we tie a fix to calling blk_mq_update_nr_hw_queues()? We might
not need to update nr_hw_queues at all. I'm fine with the
blk_mq_abandon_stopped_requests but not with its call-site.
Usually a driver knows when it wants to abandon all busy requests
using blk_mq_tagset_busy_iter(). Maybe the right approach is to add
a hook for all allocated tags? Or have blk_mq_quiesce_queue() take a
fail-all-requests parameter from the callers?
This patch is overly aggressive on failing allocated requests. There
are scenarios where we wouldn't want to abandon them, like if the hw
context is about to be brought back online, but this patch assumes all
need to be abandoned. I'll see if there are some other tricks we can have
a driver do. Thanks for the suggestions.
I agree,
I do think though that this should be driven from the driver, because
for fabrics, we might have some fabric error that triggers a periodic
reconnect. So the "hw context is about to be brought back" is unknown
from the driver pov, and when we delete the controller (because we give
up) this is exactly where we need to abandon the allocated requests.
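
To make the driver-driven idea concrete, here is a rough userspace model,
not kernel code. Every name in it (model_tagset, model_tagset_busy_iter,
model_handle_fabric_error, and so on) is a hypothetical stand-in and does
not mirror the real blk-mq signatures. It assumes an iterator that visits
every allocated tag, and a driver that decides whether to abandon requests
based on whether it is still reconnecting or has given up.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 8

/* Hypothetical request states for this model only. */
enum req_state { REQ_FREE, REQ_ALLOCATED, REQ_FAILED };

struct model_request {
	enum req_state state;
	int error;
};

struct model_tagset {
	struct model_request rqs[NR_TAGS];
};

typedef void (*busy_iter_fn)(struct model_request *rq, void *data);

/*
 * Call fn on every allocated request in the set and return how many
 * were visited. This models the "hook for all allocated tags" idea.
 */
static int model_tagset_busy_iter(struct model_tagset *set,
				  busy_iter_fn fn, void *data)
{
	int visited = 0;

	assert(fn != NULL);
	for (size_t i = 0; i < NR_TAGS; i++) {
		if (set->rqs[i].state == REQ_ALLOCATED) {
			fn(&set->rqs[i], data);
			visited++;
		}
	}
	return visited;
}

/* Driver callback: fail one request with the error passed in data. */
static void model_abandon_rq(struct model_request *rq, void *data)
{
	rq->state = REQ_FAILED;
	rq->error = *(int *)data;
}

/*
 * Driver policy: while a reconnect is still pending, leave allocated
 * requests alone; once the driver gives up (controller deletion),
 * abandon all of them. Returns the number of requests abandoned.
 */
static int model_handle_fabric_error(struct model_tagset *set, bool giving_up)
{
	int err = -EIO;

	if (!giving_up)
		return 0;
	return model_tagset_busy_iter(set, model_abandon_rq, &err);
}
```

The point of the model is only that the abandon decision lives in the
driver, since only the driver knows whether the hw context is coming back.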