On Fri, Mar 09, 2018 at 04:37:56PM +0530, Kashyap Desai wrote:
> > -----Original Message-----
> > From: Ming Lei [mailto:ming.lei@xxxxxxxxxx]
> > Sent: Friday, March 9, 2018 9:02 AM
> > To: James Bottomley; Jens Axboe; Martin K . Petersen
> > Cc: Christoph Hellwig; linux-scsi@xxxxxxxxxxxxxxx; linux-
> > block@xxxxxxxxxxxxxxx; Meelis Roos; Don Brace; Kashyap Desai; Laurence
> > Oberman; Mike Snitzer; Ming Lei; Hannes Reinecke; James Bottomley; Artem
> > Bityutskiy
> > Subject: [PATCH V4 2/4] scsi: megaraid_sas: fix selection of reply queue
> >
> > From 84676c1f21 (genirq/affinity: assign vectors to all possible CPUs),
> one
> > msix vector can be created without any online CPU mapped, then command
> > may be queued, and won't be notified after its completion.
> >
> > This patch setups mapping between cpu and reply queue according to irq
> > affinity info retrived by pci_irq_get_affinity(), and uses this info to
> choose
> > reply queue for queuing one command.
> >
> > Then the chosen reply queue has to be active, and fixes IO hang caused
> by
> > using inactive reply queue which doesn't have any online CPU mapped.
>
> Also megaraid FW will use reply queue 0 for any async notification. We
> want to set pre_vectors = 1 and make sure reply queue 0 is not part of
> affinity hint.
> To meet that requirement, I have to make some more changes like add extra
> queue.
> Example if reply queue supported by FW is 96 and online CPU is 16, current
> driver will allocate 16 msix vector. We may have to allocate 17 msix
> vector and reserve reply queue 0 for async reply from FW.
>
> I will be sending follow up patch soon.

OK, but the above extra change shouldn't belong to this patch, which
focuses on fixing IO hang because of reply queue selection.

>
> >
> > Cc: Hannes Reinecke <hare@xxxxxxx>
Petersen" <martin.petersen@xxxxxxxxxx>, > > Cc: James Bottomley <james.bottomley@xxxxxxxxxxxxxxxxxxxxx>, > > Cc: Christoph Hellwig <hch@xxxxxx>, > > Cc: Don Brace <don.brace@xxxxxxxxxxxxx> > > Cc: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx> > > Cc: Laurence Oberman <loberman@xxxxxxxxxx> > > Cc: Mike Snitzer <snitzer@xxxxxxxxxx> > > Cc: Meelis Roos <mroos@xxxxxxxx> > > Cc: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxx> > > Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible > CPUs") > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx> > > --- > > drivers/scsi/megaraid/megaraid_sas.h | 2 +- > > drivers/scsi/megaraid/megaraid_sas_base.c | 34 > > ++++++++++++++++++++++++++++- > > drivers/scsi/megaraid/megaraid_sas_fusion.c | 12 ++++------ > > 3 files changed, 38 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h > > b/drivers/scsi/megaraid/megaraid_sas.h > > index ba6503f37756..a644d2be55b6 100644 > > --- a/drivers/scsi/megaraid/megaraid_sas.h > > +++ b/drivers/scsi/megaraid/megaraid_sas.h > > @@ -2127,7 +2127,7 @@ enum MR_PD_TYPE { > > #define MR_NVME_PAGE_SIZE_MASK 0x000000FF > > > > struct megasas_instance { > > - > > + unsigned int *reply_map; > > __le32 *producer; > > dma_addr_t producer_h; > > __le32 *consumer; > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c > > b/drivers/scsi/megaraid/megaraid_sas_base.c > > index a71ee67df084..065956cb2aeb 100644 > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c > > @@ -5165,6 +5165,26 @@ megasas_setup_jbod_map(struct > > megasas_instance *instance) > > instance->use_seqnum_jbod_fp = false; } > > > > +static void megasas_setup_reply_map(struct megasas_instance *instance) > > +{ > > + const struct cpumask *mask; > > + unsigned int queue, cpu; > > + > > + for (queue = 0; queue < instance->msix_vectors; queue++) { > > + mask = pci_irq_get_affinity(instance->pdev, queue); > > + if (!mask) > > + goto fallback; > > + > > + for_each_cpu(cpu, mask) > > + instance->reply_map[cpu] = queue; > > + } > > + return; > > + > > +fallback: > > + for_each_possible_cpu(cpu) > > + instance->reply_map[cpu] = 0; > > Fallback should be better instead of just assigning to single reply queue. > May be something like below. > > for_each_possible_cpu(cpu) > instance->reply_map[cpu] = cpu % instance->msix_vectors;; > > If smp_affinity_enable module parameter is set to 0, I see performance > drop because IO is going to single reply queue. OK, that looks one issue. If smp_affinity_enable is set as 0, we should follow the original way. > > > +} > > + > > /** > > * megasas_init_fw - Initializes the FW > > * @instance: Adapter soft state > > @@ -5343,6 +5363,8 @@ static int megasas_init_fw(struct megasas_instance > > *instance) > > goto fail_setup_irqs; > > } > > > > + megasas_setup_reply_map(instance); > > + > > dev_info(&instance->pdev->dev, > > "firmware supports msix\t: (%d)", fw_msix_count); > > dev_info(&instance->pdev->dev, > > @@ -6448,6 +6470,11 @@ static int megasas_probe_one(struct pci_dev > > *pdev, > > memset(instance, 0, sizeof(*instance)); > > atomic_set(&instance->fw_reset_no_pci_access, 0); > > > > + instance->reply_map = kzalloc(sizeof(unsigned int) * nr_cpu_ids, > > + GFP_KERNEL); > > + if (!instance->reply_map) > > + goto fail_alloc_reply_map; > > + > We have common function megasas_alloc_ctrl_mem() to manage allocation. > Good candidate to move this allocation to megasas_alloc_ctrl_mem. Good point, will do that in V5. -- Ming