Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests

Hannes Reinecke <hare@xxxxxxx> · Mon, 2 Mar 2020 10:17:16 +0100

On 2/27/20 1:32 PM, John Garry wrote:

    Is blk_mq_hw_ctx.nr_active really the same as scsi_device.device_busy?

*They are not the same, but nr_active serves our purpose of getting the 
number of outstanding I/O requests. Please refer to the link below for 
more details:*

https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/

Thanks for the pointer, but there did not seem to be a conclusion there: 
https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/

Anyway, if we move to exposing multiple HW queues in the megaraid SAS 
driver:

  host->nr_hw_queues = instance->msix_vectors -
                       instance->low_latency_index_start;

Then hctx->nr_active will no longer be the total active requests per 
host, but rather per hctx.
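
To make the per-hctx point concrete, here is a minimal userspace sketch 
(not driver code). The struct and function names are hypothetical 
stand-ins; only the nr_active counter is taken from the discussion. With 
several HW queues, a host-wide figure would have to be summed over all 
of them:

```c
#include <stddef.h>

/* Hypothetical stand-in for a blk-mq hardware context;
 * only the in-flight counter matters for this illustration. */
struct model_hctx {
    int nr_active; /* requests in flight on this hardware queue */
};

/* Host-wide in-flight count: the sum over all hardware queues.
 * A single hctx's nr_active only covers its own queue. */
static int model_host_busy(const struct model_hctx *hctx, size_t nr_hw_queues)
{
    int total = 0;

    for (size_t i = 0; i < nr_hw_queues; i++)
        total += hctx[i].nr_active;
    return total;
}
```

With one HW queue the two values coincide; with more than one they only 
coincide if all I/O happens to land on a single queue.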

In addition, hctx->nr_active will no longer be properly maintained, as 
it would be based on the hctx HW queue actually being used by the LLDD 
for that request, which is not always true now. That is because in 
megasas_get_msix_index() a judgement may be made to use a low-latency HW 
queue instead:

static inline void
megasas_get_msix_index(struct megasas_instance *instance,
                       struct scsi_cmnd *scmd,
                       struct megasas_cmd_fusion *cmd,
                       u8 data_arms)
{
...

        sdev_busy = atomic_read(&hctx->nr_active);

        if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
            sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
                cmd->request_desc->SCSIIO.MSIxIndex =
                        mega_mod64(...);
        else if (instance->msix_load_balance)
                cmd->request_desc->SCSIIO.MSIxIndex =
                        (mega_mod64(...,
                                instance->msix_vectors));
...
}

Will this make a difference? I am not sure. Maybe, on this basis, 
megaraid_sas is not a good candidate for exposing multiple queues.

Ignoring that for a moment, since we no longer keep a host busy count, 
and I figure that we don't want to go back to using 
scsi_device.device_busy, is hctx->nr_active OK to use as the basis for 
deciding whether to use these performance queues?
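
One way to see why the answer may matter: a small sketch of the 
threshold test from the snippet above, with the count passed in 
explicitly. The function name is hypothetical, and the depth parameter 
merely stands in for MR_DEVICE_HIGH_IOPS_DEPTH; no real value is assumed 
for it.

```c
#include <stdbool.h>

/* Mirrors the comparison in the quoted snippet:
 *   busy > data_arms * depth
 * "depth" stands in for MR_DEVICE_HIGH_IOPS_DEPTH and is a parameter
 * here so that no particular driver value is assumed. */
static bool model_use_high_iops(int busy, int data_arms, int depth)
{
    return busy > data_arms * depth;
}
```

If a device has 20 requests in flight spread evenly over 4 hctxs, then 
with data_arms = 2 and an illustrative depth of 8 the device-wide count 
(20) exceeds the threshold of 16, while each per-hctx count (5) does 
not, so the two inputs would steer the same I/O differently.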

Personally, I wonder whether the current implementation of high-IOPS 
queues makes sense with multiqueue.
The thing is, the current high-IOPS mechanism of shifting I/O to 
another internal queue doesn't align nicely with the blk-mq architecture.
What we _do_ have, though, is a 'poll' queue mechanism, which lets us 
set aside one (or several) queues for polling, to allow for ... 
indeed, high IOPS.
So it would be interesting to find out whether we would get similar 
performance by using the 'poll' queue implementation from blk-mq instead 
of the current one.
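
As a rough illustration of what "setting aside" queues means, here is a 
userspace sketch loosely modeled on blk-mq's per-type queue maps 
(HCTX_TYPE_DEFAULT / HCTX_TYPE_POLL, each with a queue count and an 
offset). All names below are hypothetical, and nothing here is 
megaraid_sas code; it only shows the partitioning arithmetic.

```c
#include <assert.h>

/* Hypothetical queue classes, loosely following blk-mq's hctx types. */
enum model_hctx_type { MODEL_DEFAULT, MODEL_POLL, MODEL_MAX_TYPES };

/* Hypothetical per-type map: how many HW queues, starting at which index. */
struct model_queue_map {
    unsigned int nr_queues;
    unsigned int queue_offset;
};

/* Split nr_total HW queues so that the last nr_poll are polled queues
 * and the rest are ordinary interrupt-driven ones. */
static void model_split_queues(unsigned int nr_total, unsigned int nr_poll,
                               struct model_queue_map map[MODEL_MAX_TYPES])
{
    /* At least one interrupt-driven queue must remain. */
    assert(nr_poll < nr_total);

    map[MODEL_DEFAULT].nr_queues = nr_total - nr_poll;
    map[MODEL_DEFAULT].queue_offset = 0;
    map[MODEL_POLL].nr_queues = nr_poll;
    map[MODEL_POLL].queue_offset = nr_total - nr_poll;
}
```

The point of the split is that the block layer, not the LLDD, decides 
which class a request goes to, so the per-hctx accounting stays 
consistent with the queue actually used.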

That would also let us support the io_uring interface natively with 
megaraid_sas, which I think would be a benefit in its own right.

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@xxxxxxx                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer