On Mon, Mar 2, 2020 at 3:21 PM John Garry <john.garry@xxxxxxxxxx> wrote:
>
>
> >> static inline void
> >> megasas_get_msix_index(struct megasas_instance *instance,
> >>                        struct scsi_cmnd *scmd,
> >>                        struct megasas_cmd_fusion *cmd,
> >>                        u8 data_arms)
> >> {
> >> ...
> >>
> >>         sdev_busy = atomic_read(&hctx->nr_active);
> >>
> >>         if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
> >>             sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
> >>                 cmd->request_desc->SCSIIO.MSIxIndex =
> >>                         mega_mod64(...);
> >>         else if (instance->msix_load_balance)
> >>                 cmd->request_desc->SCSIIO.MSIxIndex =
> >>                         (mega_mod64(...,
> >>                                 instance->msix_vectors));
> >>
> >> Will this make a difference? I am not sure. Maybe, on this basis,
> >> megaraid_sas is not a good candidate to change to expose multiple
> >> queues.
> >>
> >> Ignoring that for a moment, since we no longer keep a host busy count,
> >> and I figure that we don't want to go back to using
> >> scsi_device.device_busy, is the judgement of hctx->nr_active ok to use
> >> to decide whether to use these performance queues?
> >>
> > Personally, I wonder if the current implementation of high-IOPs queues
> > makes sense with multiqueue.
> > Thing is, the current high-IOPs queue mechanism of shifting I/O to
> > another internal queue doesn't align nicely with the blk-mq
> > architecture.
>
> Right, we should not be hiding HW queues from blk-mq like this. This
> breaks the symmetry. Maybe we can move this functionality to blk-mq, but
> I doubt that this is a common use case.

We added this concept of extra queues for the latest generation of
megaraid_sas controllers for performance reasons. Here is some background:
https://lore.kernel.org/lkml/20180829084618.GA24765@ming.t460p/t/

We worked with the community on an interface that lets managed interrupts
(for the low-latency queues) and non-managed interrupts (for the high-IOPs
queues) coexist.

> >
> > What we _do_ have, though, is a 'poll' queue mechanism, allowing to
> > separate out one (or several) queues for polling, to allow for ...
> > indeed, high-IOPs.
>
> Any examples or references for this?
>
> > So it would be interesting to figure out if we don't get similar
> > performance by using the 'poll' queue implementation from blk-mq
> > instead of the current one.
>
> I thought that this driver or mpt3sas already used a polling mode.
>
> And for these low-latency queues, I figure that the issue is not just
> polling vs interrupt, but indeed how fast the HW queue can consume SQEs.
> A HW queue may only be able to consume at a limited rate - that's why we
> segregate.

Yes, there is no polling in any of the HW queues. The high-IOPs queues have
interrupt coalescing enabled, whereas the low-latency queues do not. The
megaraid_sas driver chooses which of the two sets of queues to use depending
on the workload: for a latency-oriented workload it uses the low-latency
queues, and for an IOPs-oriented profile it uses the high-IOPs queues.
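Roughly, the per-command selection looks like the sketch below. This is only
an illustration - the queue counts and the depth threshold are made-up
values, not what the driver actually uses; the real logic is in
megasas_get_msix_index(), quoted above.

/*
 * Illustrative user-space model of the "balanced" queue selection policy.
 * All constants here are invented for the example.
 */
#include <stdio.h>

#define NR_HIGH_IOPS_QUEUES    8    /* interrupt coalescing enabled */
#define NR_LOW_LATENCY_QUEUES  64   /* one per CPU, no coalescing */
#define HIGH_IOPS_DEPTH        8    /* per-arm outstanding-I/O threshold */

static unsigned int pick_reply_queue(unsigned int sdev_busy,
                                     unsigned int data_arms,
                                     unsigned int cpu)
{
        /*
         * The device already has many I/Os outstanding: throughput matters
         * more than latency, so spread across the small set of coalesced
         * high-IOPs queues.
         */
        if (sdev_busy > data_arms * HIGH_IOPS_DEPTH)
                return sdev_busy % NR_HIGH_IOPS_QUEUES;

        /*
         * Shallow outstanding queue: latency matters, so use the submitting
         * CPU's dedicated non-coalesced queue.
         */
        return NR_HIGH_IOPS_QUEUES + (cpu % NR_LOW_LATENCY_QUEUES);
}

int main(void)
{
        /* 2 outstanding I/Os, 1 arm -> low-latency queue of CPU 5 */
        printf("light load -> queue %u\n", pick_reply_queue(2, 1, 5));
        /* 64 outstanding I/Os, 1 arm -> one of the high-IOPs queues */
        printf("heavy load -> queue %u\n", pick_reply_queue(64, 1, 5));
        return 0;
}

The point is simply that the check is made per command, so a device moves
between the two sets of queues as its outstanding I/O count changes.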
>
> As an aside, that is actually an issue for blk-mq. For a 1-to-many HW
> queue-to-CPU mapping, limiting many CPUs to a single queue can limit IOPs,
> since HW queues can only consume at a limited rate.

We were able to achieve the performance target for the latest-generation
MegaRAID controller with this model: a few high-IOPs HW queues mapped to the
CPUs of the local NUMA node, plus low-latency queues with a one-to-one
mapping to CPUs. This is the default queue-segregation behavior of the
megaraid_sas driver, and it satisfies our IOPs and latency requirements
together. However, we do provide the module parameter "perf_mode" to tune
the queue behavior, i.e. to turn interrupt coalescing on/off on all HW
queues, in which case this one-to-many queue-to-CPU mapping does not happen.

Thanks,
Sumit

>
> >
> > Which would also have the benefit that we could support the io_uring
> > interface natively with megaraid_sas, which I think would be a benefit
> > on its own.
> >
>
> thanks,
> John
>