On 2/27/20 1:32 PM, John Garry wrote:
Is blk_mq_hw_ctx.nr_active really the same as
scsi_device.device_busy?
*They are not the same, but nr_active serves our purpose of getting the
number of outstanding I/O requests. Please refer to the link below for
more details:*
https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/
Thanks for the pointer, but there did not seem to be a conclusion there:
https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/
Anyway, if we move to exposing multiple HW queues in the megaraid SAS
driver:
    host->nr_hw_queues = instance->msix_vectors -
                         instance->low_latency_index_start;
Then hctx->nr_active will no longer be the total active requests per
host, but rather per hctx.
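
To make the per-hctx point concrete, here is a minimal userspace sketch, not kernel code. The names model_hctx and model_host_busy are hypothetical and exist only for illustration. It assumes each hardware context keeps its own in-flight counter, so a host-wide figure has to be summed across all of them:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one blk-mq hardware context. */
struct model_hctx {
	atomic_int nr_active;	/* requests in flight on this queue only */
};

/*
 * With several hardware queues, one counter covers only its own queue.
 * A host-wide count needs a walk over every context.
 */
static int model_host_busy(struct model_hctx *hctx, size_t nr_hw_queues)
{
	int total = 0;

	for (size_t i = 0; i < nr_hw_queues; i++)
		total += atomic_load(&hctx[i].nr_active);
	return total;
}
```

Reading a single counter, as the driver does today, would then see only one queue's share of the load.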
In addition, hctx->nr_active will no longer be properly maintained. Its
accounting assumes that the LLDD issues each request on the HW queue of
the hctx it was submitted on, which is not always true now. That is
because in megasas_get_msix_index() a judgement may be made to use a
low-latency HW queue instead:
    static inline void
    megasas_get_msix_index(struct megasas_instance *instance,
                           struct scsi_cmnd *scmd,
                           struct megasas_cmd_fusion *cmd,
                           u8 data_arms)
    {
        ...
        sdev_busy = atomic_read(&hctx->nr_active);
        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
            sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
            cmd->request_desc->SCSIIO.MSIxIndex =
                mega_mod64(...),
        else if (instance->msix_load_balance)
            cmd->request_desc->SCSIIO.MSIxIndex =
                (mega_mod64(...),
                    instance->msix_vectors));
Will this make a difference? I am not sure. Maybe, on this basis,
megaraid_sas is not a good candidate for exposing multiple queues.
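
The accounting mismatch can be illustrated with a small userspace model. This is only a sketch under stated assumptions. It does not reproduce the elided mega_mod64(...) arguments, MODEL_HIGH_IOPS_DEPTH is an assumed threshold chosen for illustration, and every model_* name is hypothetical:

```c
#include <stdbool.h>

#define MODEL_HIGH_IOPS_DEPTH 8	/* assumed value, for illustration only */

struct model_choice {
	unsigned int submit_hctx;	/* queue blk-mq accounted the request on */
	unsigned int reply_queue;	/* queue the driver actually uses */
};

/*
 * Mimics the shape of the decision: once the busy count exceeds
 * data_arms * depth, the request is steered to a different queue
 * (alt_queue) than the one it was submitted on.
 */
static struct model_choice model_pick_queue(unsigned int submit_hctx,
					    unsigned int busy,
					    unsigned int data_arms,
					    unsigned int alt_queue)
{
	struct model_choice c = { submit_hctx, submit_hctx };

	if (busy > data_arms * MODEL_HIGH_IOPS_DEPTH)
		c.reply_queue = alt_queue;
	return c;
}

/* True only while the per-hctx counter still describes the queue in use. */
static bool model_accounting_matches(struct model_choice c)
{
	return c.submit_hctx == c.reply_queue;
}
```

Whenever model_accounting_matches() returns false, the counter that was incremented belongs to a queue the command never actually used, which is the concern raised above.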
Ignoring that for a moment, since we no longer keep a host busy count,
and I figure that we don't want to go back to using
scsi_device.device_busy, is hctx->nr_active an acceptable basis for
deciding whether to use these performance queues?
Personally, I wonder if the current implementation of high-IOPs queues
makes sense with multiqueue.
Thing is, the current high-IOPs queue mechanism of shifting I/O to
another internal queue doesn't align nicely with the blk-mq architecture.
What we _do_ have, though, is a 'poll' queue mechanism, which allows
separating out one (or several) queues for polling, to allow for ...
indeed, high IOPS.
So it would be interesting to figure out whether we would get similar
performance by using the 'poll' queue implementation from blk-mq instead
of the current one.
That would also let us support the io_uring interface natively with
megaraid_sas, which I think would be a benefit in its own right.
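
As a rough illustration of the idea, here is a userspace sketch of reserving some queues for polling. It is not the blk-mq API: the model_* names are hypothetical, and the layout (poll queues placed after the default ones, a CPU-modulo mapping) is an assumption made for this example only:

```c
#include <stdbool.h>

struct model_queue_map {
	unsigned int first;	/* index of the first queue in this set */
	unsigned int count;	/* number of queues in this set */
};

struct model_layout {
	struct model_queue_map def;	/* interrupt-driven queues */
	struct model_queue_map poll;	/* queues reserved for polled I/O */
};

/* Reserve the last nr_poll queues for polling, keeping at least one default. */
static struct model_layout model_split_queues(unsigned int nr_hw_queues,
					      unsigned int nr_poll)
{
	struct model_layout l;

	if (nr_poll >= nr_hw_queues)
		nr_poll = nr_hw_queues ? nr_hw_queues - 1 : 0;
	l.def.first = 0;
	l.def.count = nr_hw_queues - nr_poll;
	l.poll.first = l.def.count;
	l.poll.count = nr_poll;
	return l;
}

/* Polled requests go to the poll set; everything else uses the default set. */
static unsigned int model_select_queue(const struct model_layout *l,
				       unsigned int cpu, bool polled)
{
	const struct model_queue_map *m =
		(polled && l->poll.count) ? &l->poll : &l->def;

	if (m->count == 0)
		return 0;
	return m->first + cpu % m->count;
}
```

The point of the sketch is that the queue is chosen from the request type at submission time, so the driver would not need to redirect a command after blk-mq has already accounted for it.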
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer