Re: [PATCH V3 for 5.11 00/12] blk-mq/scsi: tracking device queue depth via sbitmap

Hi Ming,

I have tested this patch extensively in our labs.

This patch gives excellent results when a single device can deliver
very high IOPS and only a few such devices are present on the
system.
Thus, if a RAID0 volume is created out of many high-end NVMe devices,
that RAID0 volume can potentially reach a maximum IOPS equal to the
sum of the maximum IOPS of all the underlying drives. Without this
patch, the current kernel code cannot get there.

For example, for a simple RAID0 volume with 32 NVMe drives, I got
an almost 100% performance boost with this patch.
The NVMe stack does not have this limitation, and this patch goes a
long way in closing that gap.

I have also tested it in many other configurations, and did not see
any adverse side effects.

Please feel free to add:
Tested-by: Sumanesh Samanta

Thanks,
Sumanesh




On Tue, Sep 22, 2020 at 7:33 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
>
> Hi,
>
> SCSI uses one global atomic variable to track queue depth for each
> LUN/request queue. This doesn't scale well when there are lots of CPU
> cores and the disk is very fast. Broadcom folks have complained that their
> high-end HBA can't reach top performance because .device_busy is
> updated in the IO path.
>
> Replace the atomic variable sdev->device_busy with sbitmap for
> tracking scsi device queue depth.
>
> Testing on scsi_debug shows this approach improves IOPS by more than 20%.
> Meanwhile, the IOPS difference is just ~1% compared with bypassing
> .device_busy on scsi_debug via patches[1].
>
> The first 6 patches move the percpu allocation hint into sbitmap, since
> the improvement from doing percpu allocation hint on sbitmap is observable.
> They also export helpers for SCSI.
>
> Patches 7 and 8 prepare for the conversion by returning a budget token
> from the .get_budget callback and passing the budget token to the driver
> via 'struct blk_mq_queue_data' in .queue_rq().
>
> The last four patches change SCSI to track device queue
> depth via sbitmap.
>
> The patchset has been tested by Broadcom, and an obvious performance boost
> can be observed.
>
> Given it is based on both for-5.10/block and 5.10/scsi-queue, the target
> is v5.11. It is posted now to get a full review.
>
> Please comment and review!
>
> V3:
>         - rebase on both for-5.10/block and 5.10/scsi-queue.
>
> V2:
>         - fix one build failure
>
>
> Ming Lei (12):
>   sbitmap: remove sbitmap_clear_bit_unlock
>   sbitmap: maintain allocation round_robin in sbitmap
>   sbitmap: add helpers for updating allocation hint
>   sbitmap: move allocation hint into sbitmap
>   sbitmap: export sbitmap_weight
>   sbitmap: add helper of sbitmap_calculate_shift
>   blk-mq: add callbacks for storing & retrieving budget token
>   blk-mq: return budget token from .get_budget callback
>   scsi: put hot fields of scsi_host_template into one cacheline
>   scsi: add scsi_device_busy() to read sdev->device_busy
>   scsi: make sure sdev->queue_depth is <= shost->can_queue
>   scsi: replace sdev->device_busy with sbitmap
>
>  block/blk-mq-sched.c                 |  17 ++-
>  block/blk-mq.c                       |  38 +++--
>  block/blk-mq.h                       |  25 +++-
>  block/kyber-iosched.c                |   3 +-
>  drivers/message/fusion/mptsas.c      |   2 +-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c |   2 +-
>  drivers/scsi/scsi.c                  |   4 +
>  drivers/scsi/scsi_lib.c              |  69 ++++++---
>  drivers/scsi/scsi_priv.h             |   1 +
>  drivers/scsi/scsi_scan.c             |  22 ++-
>  drivers/scsi/scsi_sysfs.c            |   4 +-
>  drivers/scsi/sg.c                    |   2 +-
>  include/linux/blk-mq.h               |  13 +-
>  include/linux/sbitmap.h              |  84 +++++++----
>  include/scsi/scsi_cmnd.h             |   2 +
>  include/scsi/scsi_device.h           |   8 +-
>  include/scsi/scsi_host.h             |  72 ++++-----
>  lib/sbitmap.c                        | 213 +++++++++++++++------------
>  18 files changed, 376 insertions(+), 205 deletions(-)
>
> Cc: Omar Sandoval <osandov@xxxxxx>
> Cc: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
> Cc: Sumanesh Samanta <sumanesh.samanta@xxxxxxxxxxxx>
> Cc: Ewan D. Milne <emilne@xxxxxxxxxx>
> Cc: Hannes Reinecke <hare@xxxxxxx>
>
> --
> 2.25.2
>
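
For readers new to the series, the central idea in the quoted cover letter
(track per-device queue depth with a bitmap instead of one shared counter,
and have .get_budget hand back a budget token) can be modeled in a few
lines. This is only a conceptual sketch in Python, not kernel code: the
class BudgetMap and its methods get_budget(), put_budget() and busy() are
hypothetical names chosen for illustration, a plain integer stands in for
the kernel's sbitmap, and returning -1 on failure is an assumption of the
sketch. It does not model the scalability benefit itself, which depends on
how the real sbitmap lays out its words in memory.

```python
class BudgetMap:
    """Toy model of per-device queue-depth tracking with a bitmap.

    Hypothetical illustration only: each set bit is one in-flight command,
    and the bit index doubles as the budget token.
    """

    def __init__(self, depth):
        if depth <= 0:
            raise ValueError("depth must be positive")
        self.depth = depth
        self.bits = 0  # bit i set => token i is in use

    def get_budget(self):
        """Return a free token in [0, depth), or -1 if the device is full."""
        for token in range(self.depth):
            if not (self.bits >> token) & 1:
                self.bits |= 1 << token
                return token
        return -1

    def put_budget(self, token):
        """Release a token previously returned by get_budget()."""
        if not 0 <= token < self.depth or not (self.bits >> token) & 1:
            raise ValueError("token not in use")
        self.bits &= ~(1 << token)

    def busy(self):
        """Count in-flight commands (the role device_busy used to play)."""
        return bin(self.bits).count("1")


# Small demo: a device with queue depth 2.
dev = BudgetMap(depth=2)
t0 = dev.get_budget()     # first free token
t1 = dev.get_budget()     # second free token
full = dev.get_budget()   # depth exhausted, no token available
dev.put_budget(t0)        # a command completes
again = dev.get_budget()  # the released token can be reused
```

In this model the number of set bits plays the role of the old counter,
while the token tells the caller exactly which slot it owns, so releasing
budget is a single bit clear rather than a decrement of shared state.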

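The "percpu allocation hint" mentioned in the quoted cover letter can be
pictured as a per-CPU remembered position from which the search for a free
bit starts, so that different CPUs tend to work in different parts of the
bitmap. The sketch below is a loose conceptual model in Python under stated
assumptions, not the kernel's sbitmap code: the class HintedBitmap, its
get() and put() methods, the evenly spread initial hints, and the exact
rules for updating a hint are all hypothetical choices made for
illustration.

```python
class HintedBitmap:
    """Toy bitmap allocator with one search hint per CPU (hypothetical)."""

    def __init__(self, depth, nr_cpus):
        self.depth = depth
        self.bits = [False] * depth
        # Assumption: spread the starting points evenly across the bitmap.
        self.hint = [cpu * depth // nr_cpus for cpu in range(nr_cpus)]

    def get(self, cpu):
        """Allocate a free bit, searching from this CPU's hint with wrap."""
        start = self.hint[cpu]
        for offset in range(self.depth):
            bit = (start + offset) % self.depth
            if not self.bits[bit]:
                self.bits[bit] = True
                # Next search from this CPU starts just past the bit it got.
                self.hint[cpu] = (bit + 1) % self.depth
                return bit
        return -1  # bitmap is full

    def put(self, bit, cpu):
        """Free a bit and point this CPU's hint back at it for reuse."""
        self.bits[bit] = False
        self.hint[cpu] = bit


# Small demo: 8 bits shared by 2 CPUs, initial hints are 0 and 4.
bm = HintedBitmap(depth=8, nr_cpus=2)
a = bm.get(cpu=0)   # CPU 0 starts searching at bit 0
b = bm.get(cpu=1)   # CPU 1 starts searching at bit 4
c = bm.get(cpu=0)   # CPU 0 continues just past its previous bit
bm.put(a, cpu=0)    # freeing resets CPU 0's hint to the freed bit
d = bm.get(cpu=0)   # so the freed bit is handed out again
```

The point of the model is only that the two CPUs allocate from different
regions (bits 0 and 1 versus bit 4) instead of both scanning from the
start of the bitmap.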