Re: [PATCH v3] lpfc: Mitigate high memory pre-allocation by SCSI-MQ

On 8/26/2019 12:18 AM, Hannes Reinecke wrote:
> On 8/16/19 4:36 AM, James Smart wrote:
>> When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of
>> MQ resources based on shost values set by the driver. In newer cases
>> of the driver, which attempts to set nr_hw_queues to the cpu count,
>> the multipliers become excessive, with a single shost having SCSI-MQ
>> pre-allocation reaching into the multiple GBytes range.  NPIV, which
>> creates additional shosts, only multiplies this overhead. On lower-memory
>> systems, this can exhaust system memory very quickly, resulting in a
>> system crash or failures in the driver or elsewhere due to low memory
>> conditions.
>>
>> After testing several scenarios, the situation can be mitigated by
>> limiting the value set in shost->nr_hw_queues to 4. Although the shost
>> values were changed, the driver still had per-cpu hardware queues of
>> its own that allowed parallelization per-cpu.  Testing revealed that
>> even with the smallish number for nr_hw_queues for SCSI-MQ, performance
>> levels remained near maximum with the within-driver affinitization.
>>
>> A module parameter was created to allow the value set for the
>> nr_hw_queues to be tunable.
>>
>> Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
>> Signed-off-by: James Smart <jsmart2021@xxxxxxxxx>
>> Reviewed-by: Ming Lei <ming.lei@xxxxxxxxxx>
>>
>> ---
>> v3: add Ming's reviewed-by tag
>> ---
>>   drivers/scsi/lpfc/lpfc.h      |  1 +
>>   drivers/scsi/lpfc/lpfc_attr.c | 15 +++++++++++++++
>>   drivers/scsi/lpfc/lpfc_init.c | 10 ++++++----
>>   drivers/scsi/lpfc/lpfc_sli4.h |  5 +++++
>>   4 files changed, 27 insertions(+), 4 deletions(-)
>
> Well, that doesn't actually match with my measurements (where I've seen
> max I/O performance at about 16 queues); so I guess this is pretty much
> setup-specific.

Keep in mind, when we ran our benchmarks, the driver was still using per-cpu hdwq's selected by cpu #.


> However, I'm somewhat loath to have a cap at 128; we actually have
> several machines where we'll be having more CPUs than that.
> Can't we increase the cap to 512 to give us a bit more leeway during
> testing?

I'm fine if you want me to raise the max for the attribute. Keep in mind, if 0, it can go > 128 to whatever the cpu number is, assuming it's > 128.

-- james
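
For readers following the thread, the selection rule described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the names pick_nr_hw_queues, mq_limit, mq_limit_valid and MQ_LIMIT_MAX are hypothetical, and clamping a non-zero setting to the cpu count is an assumption. The thread itself only states that the attribute is capped at 128 and that a value of 0 follows the cpu count, even above 128.

```c
/*
 * Hypothetical sketch of the nr_hw_queues selection discussed in the thread.
 * None of these names are taken from the lpfc driver source.
 */
#define MQ_LIMIT_MAX 128U	/* attribute cap under discussion (512 proposed) */

/* Returns nonzero if the tunable's value is within the attribute's range. */
static int mq_limit_valid(unsigned int mq_limit)
{
	return mq_limit <= MQ_LIMIT_MAX;
}

/*
 * mq_limit: value of the tunable; 0 means "no limit, follow the cpu count".
 * ncpus:    number of cpus (and so of per-cpu driver hardware queues).
 *
 * Assumption: a non-zero limit larger than the cpu count is clamped down,
 * since there is nothing to gain from more SCSI-MQ queues than cpus.
 */
static unsigned int pick_nr_hw_queues(unsigned int mq_limit, unsigned int ncpus)
{
	if (mq_limit == 0 || mq_limit > ncpus)
		return ncpus;
	return mq_limit;
}
```

Under these assumptions, a setting of 4 on a 64-cpu machine gives 4 SCSI-MQ queues, while a setting of 0 on a 256-cpu machine gives 256, which is how the value can exceed the attribute's cap of 128.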