On 11/20/19 9:00 AM, Ewan D. Milne wrote:
On Wed, 2019-11-20 at 11:05 +0100, Hannes Reinecke wrote:
I must admit I patently don't like this explicit dependency on
blk_nonrot(). Having a conditional counter is just an open invitation to
getting things wrong...
This concerns me as well, it seems like the SCSI ML should have it's
own per-device attribute if we actually need to control this per-device
instead of on a per-host or per-driver basis. And it seems like this
is something that is specific to high-performance drivers, so changing
the way this works for all drivers seems a bit much.
Ordinarily I'd prefer a host template attribute as Sumanesh proposed,
but I dislike wrapping the examination of that and the queue flag in
a macro that makes it not obvious how the behavior is affected.
(Plus Hannes just submitted submitted the patches to remove .use_cmd_list,
which was another piece of ML functionality used by only a few drivers.)
Ming's patch does freeze the queue if NONROT is changed by sysfs, but
the flag can be changed by other kernel code, e.g. sd_revalidate_disk()
clears it and then calls sd_read_block_characteristics() which may set
it again. So it's not clear to me how reliable this is.
How about changing the default behavior into ignoring sdev->queue_depth
and only honoring sdev->queue_depth if a "quirk" flag is set or if
overridden by setting a sysfs attribute? My understanding is that the
goal of the queue ramp-up/ramp-down mechanism is to reduce the number of
times a SCSI device responds "BUSY". An alternative for queue
ramp-up/ramp-down is a delayed queue re-run. I think if scsi_queue_rq()
returns BLK_STS_RESOURCE that the queue is only rerun after a delay.
From blk_mq_dispatch_rq_list():
[ ... ]
else if (needs_restart && (ret == BLK_STS_RESOURCE))
blk_mq_delay_run_hw_queue(hctx, BLK_MQ_RESOURCE_DELAY);
Bart.