> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@xxxxxxxxxx]
> Sent: Friday, March 9, 2018 5:33 PM
> To: Kashyap Desai
> Cc: James Bottomley; Jens Axboe; Martin K . Petersen; Christoph Hellwig;
> linux-scsi@xxxxxxxxxxxxxxx; linux-block@xxxxxxxxxxxxxxx; Meelis Roos; Don
> Brace; Laurence Oberman; Mike Snitzer; Hannes Reinecke; Artem Bityutskiy
> Subject: Re: [PATCH V4 2/4] scsi: megaraid_sas: fix selection of reply queue
>
> On Fri, Mar 09, 2018 at 04:37:56PM +0530, Kashyap Desai wrote:
> > > -----Original Message-----
> > > From: Ming Lei [mailto:ming.lei@xxxxxxxxxx]
> > > Sent: Friday, March 9, 2018 9:02 AM
> > > To: James Bottomley; Jens Axboe; Martin K . Petersen
> > > Cc: Christoph Hellwig; linux-scsi@xxxxxxxxxxxxxxx;
> > > linux-block@xxxxxxxxxxxxxxx; Meelis Roos; Don Brace; Kashyap Desai;
> > > Laurence Oberman; Mike Snitzer; Ming Lei; Hannes Reinecke; James
> > > Bottomley; Artem Bityutskiy
> > > Subject: [PATCH V4 2/4] scsi: megaraid_sas: fix selection of reply queue
> > >
> > > Since 84676c1f21 ("genirq/affinity: assign vectors to all possible
> > > CPUs"), an MSI-x vector can be created without any online CPU mapped
> > > to it, so a command may be queued on it and never be notified of its
> > > completion.
> > >
> > > This patch sets up a mapping between CPU and reply queue according to
> > > the irq affinity info retrieved by pci_irq_get_affinity(), and uses
> > > this info to choose the reply queue when queuing a command.
> > >
> > > The chosen reply queue is then guaranteed to be active, which fixes
> > > the IO hang caused by using an inactive reply queue that has no
> > > online CPU mapped to it.
> >
> > Also, megaraid FW will use reply queue 0 for any async notification.
> > We want to set pre_vectors = 1 and make sure reply queue 0 is not part
> > of the affinity hint.
> > To meet that requirement, I have to make some more changes, like
> > adding an extra queue.
> > For example, if the FW supports 96 reply queues and there are 16 online
> > CPUs, the current driver will allocate 16 MSI-x vectors. We may have to
> > allocate 17 MSI-x vectors and reserve reply queue 0 for async replies
> > from the FW.
> >
> > I will be sending a follow-up patch soon.
>
> OK, but the above extra change shouldn't belong to this patch, which
> focuses on fixing the IO hang caused by the reply queue selection.

Fine. That will be a separate patch to handle the reply queue 0 affinity
case.
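Roughly, the idea is to pass a struct irq_affinity with pre_vectors = 1
when allocating the vectors, so that vector/reply queue 0 is left out of
the affinity spread and stays reserved for FW async notifications.
Untested sketch only, not the actual follow-up patch; the helper name and
the "+1" accounting below are just illustrative:

/*
 * Illustrative only: reserve the first vector (reply queue 0) from the
 * affinity spread via irq_affinity.pre_vectors, and allocate one extra
 * vector so the remaining queues still cover all online CPUs.
 */
static int megasas_alloc_irq_vectors_sketch(struct megasas_instance *instance,
					    unsigned int nr_reply_queues)
{
	struct irq_affinity desc = { .pre_vectors = 1 };
	int nvec;

	nvec = pci_alloc_irq_vectors_affinity(instance->pdev, 1,
					      nr_reply_queues + 1,
					      PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
					      &desc);
	if (nvec < 0)
		return nvec;

	instance->msix_vectors = nvec;
	return 0;
}

The driver-side selection would then have to keep queue 0 out of the
regular IO path, since it is reserved for async replies.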
Petersen" <martin.petersen@xxxxxxxxxx>, > > > Cc: James Bottomley <james.bottomley@xxxxxxxxxxxxxxxxxxxxx>, > > > Cc: Christoph Hellwig <hch@xxxxxx>, > > > Cc: Don Brace <don.brace@xxxxxxxxxxxxx> > > > Cc: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx> > > > Cc: Laurence Oberman <loberman@xxxxxxxxxx> > > > Cc: Mike Snitzer <snitzer@xxxxxxxxxx> > > > Cc: Meelis Roos <mroos@xxxxxxxx> > > > Cc: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxx> > > > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all > > > possible > > CPUs") > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > > --- > > > drivers/scsi/megaraid/megaraid_sas.h | 2 +- > > > drivers/scsi/megaraid/megaraid_sas_base.c | 34 > > > ++++++++++++++++++++++++++++- > > > drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 ++++------ > > > 3 files changed, 38 insertions(+), 10 deletions(-) > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h > > > b/drivers/scsi/megaraid/megaraid_sas.h > > > index ba6503f37756..a644d2be55b6 100644 > > > --- a/drivers/scsi/megaraid/megaraid_sas.h > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h > > > @@ -2127,7 +2127,7 @@ enum MR_PD_TYPE { > > > #define MR_NVME_PAGE_SIZE_MASK 0x000000FF > > > > > > struct megasas_instance { > > > - > > > + unsigned int *reply_map; > > > __le32 *producer; > > > dma_addr_t producer_h; > > > __le32 *consumer; > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c > > > b/drivers/scsi/megaraid/megaraid_sas_base.c > > > index a71ee67df084..065956cb2aeb 100644 > > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c > > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c > > > @@ -5165,6 +5165,26 @@ megasas_setup_jbod_map(struct > > > megasas_instance *instance) > > > instance->use_seqnum_jbod_fp = false; } > > > > > > +static void megasas_setup_reply_map(struct megasas_instance > > > +*instance) { > > > + const struct cpumask *mask; > > > + unsigned int queue, cpu; > > > + > > > + for (queue = 0; queue < instance->msix_vectors; queue++) { > > > + mask = pci_irq_get_affinity(instance->pdev, queue); > > > + if (!mask) > > > + goto fallback; > > > + > > > + for_each_cpu(cpu, mask) > > > + instance->reply_map[cpu] = queue; > > > + } > > > + return; > > > + > > > +fallback: > > > + for_each_possible_cpu(cpu) > > > + instance->reply_map[cpu] = 0; > > > > Fallback should be better instead of just assigning to single reply queue. > > May be something like below. > > > > for_each_possible_cpu(cpu) > > instance->reply_map[cpu] = cpu % instance->msix_vectors;; > > > > If smp_affinity_enable module parameter is set to 0, I see performance > > drop because IO is going to single reply queue. > > OK, that looks one issue. If smp_affinity_enable is set as 0, we should follow > the original way. Sent you the patch. 
> >
> > > +}
> > > +
> > >  /**
> > >   * megasas_init_fw - Initializes the FW
> > >   * @instance:		Adapter soft state
> > > @@ -5343,6 +5363,8 @@ static int megasas_init_fw(struct megasas_instance *instance)
> > >  			goto fail_setup_irqs;
> > >  	}
> > >
> > > +	megasas_setup_reply_map(instance);
> > > +
> > >  	dev_info(&instance->pdev->dev,
> > >  		"firmware supports msix\t: (%d)", fw_msix_count);
> > >  	dev_info(&instance->pdev->dev,
> > > @@ -6448,6 +6470,11 @@ static int megasas_probe_one(struct pci_dev *pdev,
> > >  	memset(instance, 0, sizeof(*instance));
> > >  	atomic_set(&instance->fw_reset_no_pci_access, 0);
> > >
> > > +	instance->reply_map = kzalloc(sizeof(unsigned int) * nr_cpu_ids,
> > > +				      GFP_KERNEL);
> > > +	if (!instance->reply_map)
> > > +		goto fail_alloc_reply_map;
> > > +
> >
> > We have the common function megasas_alloc_ctrl_mem() to manage
> > allocations. It is a good candidate for holding this allocation as well.
>
> Good point, will do that in V5.

Sent you the patch.

>
> --
> Ming
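For the V5 rework, the reply_map allocation could sit with the rest of the
controller memory, e.g. via small helpers invoked from
megasas_alloc_ctrl_mem()/megasas_free_ctrl_mem(). The helper names below
are only illustrative, not the actual V5 patch:

/*
 * Illustrative helpers: allocation called from megasas_alloc_ctrl_mem()
 * and release from megasas_free_ctrl_mem(), instead of open-coding the
 * kzalloc() in megasas_probe_one() as in the hunk quoted above.
 */
static int megasas_alloc_reply_map(struct megasas_instance *instance)
{
	instance->reply_map = kcalloc(nr_cpu_ids, sizeof(unsigned int),
				      GFP_KERNEL);
	return instance->reply_map ? 0 : -ENOMEM;
}

static void megasas_free_reply_map(struct megasas_instance *instance)
{
	kfree(instance->reply_map);
	instance->reply_map = NULL;
}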