On 8/26/2019 12:18 AM, Hannes Reinecke wrote:
On 8/16/19 4:36 AM, James Smart wrote:
When SCSI-MQ is enabled, the SCSI-MQ layers pre-allocate MQ resources
based on shost values set by the driver. Newer versions of the driver
attempt to set nr_hw_queues to the cpu count, and the multipliers
become excessive, with a single shost having SCSI-MQ pre-allocation
reaching into the multiple GBytes range. NPIV, which creates additional
shosts, only multiplies this overhead. On lower-memory
systems, this can exhaust system memory very quickly, resulting in a
system crash or failures in the driver or elsewhere due to low memory
conditions.
Testing several scenarios showed that the situation can be mitigated by
limiting the value set in shost->nr_hw_queues to 4. Although the shost
values were changed, the driver still had per-cpu hardware queues of
its own that allowed parallelization per-cpu. Testing revealed that
even with the smallish number for nr_hw_queues for SCSI-MQ, performance
levels remained near maximum with the within-driver affinitization.
A module parameter was created to allow the value set for the
nr_hw_queues to be tunable.
Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
Signed-off-by: James Smart <jsmart2021@xxxxxxxxx>
Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>
---
v3: add Ming's reviewed-by tag
---
drivers/scsi/lpfc/lpfc.h | 1 +
drivers/scsi/lpfc/lpfc_attr.c | 15 +++++++++++++++
drivers/scsi/lpfc/lpfc_init.c | 10 ++++++----
drivers/scsi/lpfc/lpfc_sli4.h | 5 +++++
4 files changed, 27 insertions(+), 4 deletions(-)
Well, that doesn't actually match with my measurements (where I've seen
max I/O performance at about 16 queues); so I guess this is pretty much
setup-specific.
Keep in mind, when we ran our benchmarks, the driver was still using
per-cpu hdwq's selected by cpu #.
However, I'm somewhat loath to have a cap at 128; we actually have
several machines where we'll be having more CPUs than that.
Can't we increase the cap to 512 to give us a bit more leeway during
testing?
I'm fine if you want me to raise the max for the attribute. Keep in
mind that if the attribute is set to 0, the value is not capped: it can
go above 128, up to the cpu count, on systems with more than 128 cpus.
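
For reference, the behavior described above could be sketched roughly as
follows. This is a standalone illustration, not the code from the patch:
the names pick_nr_hw_queues, threshold_is_valid, and MQ_THRESHOLD_MAX are
hypothetical, and clamping a nonzero value to the cpu count is an
assumption rather than something stated in this thread.

```c
/* Hypothetical upper bound for the tunable; 128 is the cap discussed here. */
#define MQ_THRESHOLD_MAX 128u

/* Accept 0 (meaning "no cap") through the maximum; reject anything larger. */
static int threshold_is_valid(unsigned int threshold)
{
	return threshold <= MQ_THRESHOLD_MAX;
}

/*
 * Choose the queue count to advertise to SCSI-MQ.
 *   threshold == 0 : no cap, follow the cpu count (may exceed 128).
 *   threshold  > 0 : use the threshold, but never more than the cpu
 *                    count (assumption made for this sketch).
 */
static unsigned int pick_nr_hw_queues(unsigned int threshold,
				      unsigned int cpu_count)
{
	if (threshold == 0 || threshold > cpu_count)
		return cpu_count;
	return threshold;
}
```

With these assumptions, a 256-cpu machine would advertise 256 queues when
the attribute is 0 and 4 queues when it is 4. Raising the max to 512 would
only change which nonzero values are accepted.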
-- james