> -----Original Message-----
> From: Hannes Reinecke [mailto:hare@xxxxxxx]
> Sent: Wednesday, February 01, 2017 12:21 PM
> To: Kashyap Desai; Christoph Hellwig
> Cc: Martin K. Petersen; James Bottomley; linux-scsi@xxxxxxxxxxxxxxx;
> Sathya Prakash Veerichetty; PDL-MPT-FUSIONLINUX; Sreekanth Reddy
> Subject: Re: [PATCH 00/10] mpt3sas: full mq support
>
> On 01/31/2017 06:54 PM, Kashyap Desai wrote:
> >> -----Original Message-----
> >> From: Hannes Reinecke [mailto:hare@xxxxxxx]
> >> Sent: Tuesday, January 31, 2017 4:47 PM
> >> To: Christoph Hellwig
> >> Cc: Martin K. Petersen; James Bottomley; linux-scsi@xxxxxxxxxxxxxxx;
> >> Sathya Prakash; Kashyap Desai; mpt-fusionlinux.pdl@xxxxxxxxxxxx
> >> Subject: Re: [PATCH 00/10] mpt3sas: full mq support
> >>
> >> On 01/31/2017 11:02 AM, Christoph Hellwig wrote:
> >>> On Tue, Jan 31, 2017 at 10:25:50AM +0100, Hannes Reinecke wrote:
> >>>> Hi all,
> >>>>
> >>>> this is a patchset to enable full multiqueue support for the
> >>>> mpt3sas driver.
> >>>> While the HBA only has a single mailbox register for submitting
> >>>> commands, it does have individual receive queues per MSI-X
> >>>> interrupt and as such does benefit from converting it to full
> >>>> multiqueue support.
> >>>
> >>> Explanation and numbers on why this would be beneficial, please.
> >>> We should not need multiple submissions queues for a single register
> >>> to benefit from multiple completion queues.
> >>>
> >> Well, the actual throughput very strongly depends on the blk-mq-sched
> >> patches from Jens.
> >> As this is barely finished I didn't post any numbers yet.
> >>
> >> However:
> >> With multiqueue support:
> >> 4k seq read : io=60573MB, bw=1009.2MB/s, iops=258353, runt= 60021msec
> >> With scsi-mq on 1 queue:
> >> 4k seq read : io=17369MB, bw=296291KB/s, iops=74072, runt= 60028msec
> >> So yes, there _is_ a benefit.
> >>
> >> (Which is actually quite cool, as these tests were done on a SAS3
> >> HBA, so we're getting close to the theoretical maximum of 1.2GB/s).
> >> (Unlike the single-queue case :-)
> >
> > Hannes -
> >
> > Can you share detail about setup ? How many drives do you have and how
> > is connection (enclosure -> drives. ??) ?
> > To me it looks like current mpt3sas driver might be taking more hit in
> > spinlock operation (penalty on NUMA arch is more compare to single core
> > server) unlike we have in megaraid_sas driver use of shared blk tag.
> >
> The tests were done with a single LSI SAS3008 connected to a NetApp
> E-series (2660), using 4 LUNs under MD-RAID0.
>
> Megaraid_sas is even worse here; due to the odd nature of the 'fusion'
> implementation we're ending up having _two_ sets of tags, making it
> really hard to use scsi-mq here.

The current megaraid_sas driver exposes a single submission queue to
blk-mq and does not hit a similar performance issue, so we may not see a
significant performance improvement if we attempt the same conversion for
megaraid_sas.

We had a similar discussion for megaraid_sas and hpsa:
http://www.spinics.net/lists/linux-scsi/msg101838.html

This patch series looks like a similar attempt for mpt3sas. Am I missing
anything?

The megaraid_sas driver simply indexes its command array with the blk-mq
tag and fires the I/O right away, whereas in mpt3sas the driver-level lock
contention is the bottleneck.
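Roughly, the difference I am describing looks like this. This is a trimmed
sketch of the two allocation paths, not the exact driver code (the real
functions are megasas_get_cmd_fusion() and mpt3sas_base_get_smid_scsiio();
the sketch_* names below are my own shorthand):

/*
 * megaraid_sas (fusion): the blk-mq tag from the request indexes the
 * per-adapter command array directly, so no driver lock is needed to
 * pick a command on the submission path.
 */
static struct megasas_cmd_fusion *
sketch_get_cmd_fusion(struct fusion_context *fusion, u32 blk_tag)
{
	return fusion->cmd_list[blk_tag];
}

/*
 * mpt3sas (current): every SCSI I/O takes ioc->scsi_lookup_lock and pulls
 * a smid off a shared free list -- that shared lock is the contention
 * point on multi-socket systems.
 */
static u16 sketch_get_smid_scsiio(struct MPT3SAS_ADAPTER *ioc)
{
	struct scsiio_tracker *request;
	unsigned long flags;
	u16 smid;

	spin_lock_irqsave(&ioc->scsi_lookup_lock, flags);
	if (list_empty(&ioc->free_list)) {
		spin_unlock_irqrestore(&ioc->scsi_lookup_lock, flags);
		return 0;	/* no smid available */
	}
	request = list_entry(ioc->free_list.next, struct scsiio_tracker,
			     tracker_list);
	smid = request->smid;
	list_del(&request->tracker_list);
	spin_unlock_irqrestore(&ioc->scsi_lookup_lock, flags);
	return smid;
}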
> (Not that I didn't try; but lacking a proper backend it's really hard
> to evaluate the benefit of those ... spinning HDDs simply don't cut it
> here)
>
> > I mean "[PATCH 08/10] mpt3sas: lockless command submission for
> > scsi-mq" patch is improving performance removing spinlock overhead and
> > attempting to get request using blk_tags.
> > Are you seeing performance improvement if you hard code nr_hw_queues
> > = 1 in below code changes part of "[PATCH 10/10] mpt3sas: scsi-mq
> > interrupt steering"
> >
> No. The numbers posted above are generated with exactly that patch; the
> first line is running with nr_hw_queues=32 and the second line with
> nr_hw_queues=1.

Thanks Hannes, that clarifies it. Can you share the fio script you used?

If my understanding is correct, you should see the theoretical maximum of
1.2GB/s if you restrict the workload to a single NUMA node. This is just
to understand whether the mpt3sas driver spinlocks are adding the
overhead. We have seen such overhead on multi-socket servers, and it is
reasonable to reduce locking in the mpt3sas driver; my only concern is
that exposing the HBA as multiple submission queues to blk-mq is not
really required, and I am trying to figure out whether doing so has any
side effects.

>
> Curiously, though, patch 8/10 also reduces the 'can_queue' value by
> dividing it by the number of CPUs (required for blk tag space scaling).
> If I _increase_ can_queue after setting up the tagspace to the original
> value performance _drops_ again.
> Most unexpected; I'll be doing more experimenting there.
>
> Full results will be presented at VAULT, btw :-)
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                zSeries & Storage
> hare@xxxxxxx                       +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
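P.S.: Just to make sure I am reading the can_queue scaling correctly, this
is the kind of split I understand patch 8/10 to be doing. Illustrative
only -- the function name is my own shorthand, not the actual patch code:

/*
 * The single firmware queue depth is divided across the blk-mq hardware
 * queues so that the aggregate number of tags still matches what the HBA
 * can accept.
 */
static void sketch_scale_tagspace(struct Scsi_Host *shost,
				  struct MPT3SAS_ADAPTER *ioc)
{
	/* one hardware queue per CPU / MSI-X vector */
	shost->nr_hw_queues = num_online_cpus();

	/* each hardware queue gets its share of the firmware queue depth */
	shost->can_queue = ioc->hba_queue_depth / shost->nr_hw_queues;
}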