On Thu, 2017-04-27 at 15:08 -0700, jsmart2021@xxxxxxxxx wrote:
> From: James Smart <jsmart2021@xxxxxxxxx>
>
> To select the appropriate shost template, the driver is issuing
> a mailbox command to retrieve the wwn. Turns out the sending of
> the command precedes the reset of the function. On SLI-4 adapters,
> this is inconsequential as the mailbox command location is specified
> by dma via the BMBX register. However, on SLI-3 adapters, the
> location of the mailbox command submission area changes. When the
> function is first powered on or reset, the cmd is submitted via PCI
> bar memory. Later the driver changes the function config to use
> host memory and DMA. The request to start a mailbox command is the
> same, a simple doorbell write, regardless of submission area.
> So.. if there has not been a boot driver run against the adapter,
> the mailbox command works as defaults are ok. But, if the boot
> driver has configured the card and, and if no platform pci
> function/slot reset occurs as the os starts, the mailbox command
> will fail. The SLI-3 device will use the stale boot driver dma
> location. This can cause PCI eeh errors.
>
> Fix is to reset the sli-3 function before sending the
> mailbox command, thus synchronizing the function/driver on mailbox
> location.
>
> Note: The fix uses routines that are typically invoked later in the
> call flow to reset the sli-3 device. The issue in using those routines is
> that the normal (non-fix) flow does additional initialization, namely the
> allocation of the pport structure. So, rather than significantly reworking
> the initialization flow so that the pport is alloc'd first, pointer checks
> are added to work around it. Checks are limited to the routines invoked
> by a sli-3 adapter (s3 routines) as this fix/early call is only invoked
> on a sli3 adapter. Nothing changes post the fix. Subsequent initialization,
> and another adapter reset, still occur - both on sli-3 and sli-4 adapters.
>
> Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
> Signed-off-by: James Smart <james.smart@xxxxxxxxxxxx>
> Fixes: 96418b5e2c88 ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> The issue was introduced by one of the patches in the lpfc 11.2.0.10
> patch set: http://www.spinics.net/lists/linux-scsi/msg105908.html
> As the set contained a number of NVME fixes, it was picked up only in
> the linux-block tree. The scsi trees are at ~11.2.0.7.
>
> This patch was cut against the nvme-4.12 tree, which is backed by
> linux-block. The patch must go in via the block tree.
>
> As the patch is scsi-specific, it was cherry-picked for stable trees.
> Thus the fix needs to be picked up by them.
>
> Warning: the patch applies to pre-11.2.0.10 versions of lpfc (in the
> scsi trees) reporting only fuzz warnings, but creates something
> completely erroneous.
> ---
>  drivers/scsi/lpfc/lpfc_crtn.h |  1 +
>  drivers/scsi/lpfc/lpfc_init.c |  7 +++++++
>  drivers/scsi/lpfc/lpfc_sli.c  | 19 ++++++++++++-------
>  3 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
> index 944b32c..1c55408 100644
> --- a/drivers/scsi/lpfc/lpfc_crtn.h
> +++ b/drivers/scsi/lpfc/lpfc_crtn.h
> @@ -294,6 +294,7 @@ int lpfc_selective_reset(struct lpfc_hba *);
>  void lpfc_reset_barrier(struct lpfc_hba *);
>  int lpfc_sli_brdready(struct lpfc_hba *, uint32_t);
>  int lpfc_sli_brdkill(struct lpfc_hba *);
> +int lpfc_sli_chipset_init(struct lpfc_hba *phba);
>  int lpfc_sli_brdreset(struct lpfc_hba *);
>  int lpfc_sli_brdrestart(struct lpfc_hba *);
>  int lpfc_sli_hba_setup(struct lpfc_hba *);
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 90ae354..e85f273 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -3602,6 +3602,13 @@ lpfc_get_wwpn(struct lpfc_hba *phba)
>  	LPFC_MBOXQ_t *mboxq;
>  	MAILBOX_t *mb;
>  
> +	if (phba->sli_rev < LPFC_SLI_REV4) {
> +		/* Reset the port first */
> +		lpfc_sli_brdrestart(phba);
> +		rc = lpfc_sli_chipset_init(phba);
> +		if (rc)
> +			return (uint64_t)-1;
> +	}
>  
>  	mboxq = (LPFC_MBOXQ_t *) mempool_alloc(phba->mbox_mem_pool,
>  						GFP_KERNEL);
> diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
> index cf19f49..2a4fc00 100644
> --- a/drivers/scsi/lpfc/lpfc_sli.c
> +++ b/drivers/scsi/lpfc/lpfc_sli.c
> @@ -4204,13 +4204,16 @@ lpfc_sli_brdreset(struct lpfc_hba *phba)
>  	/* Reset HBA */
>  	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
>  			"0325 Reset HBA Data: x%x x%x\n",
> -			phba->pport->port_state, psli->sli_flag);
> +			(phba->pport) ? phba->pport->port_state : 0,
> +			psli->sli_flag);
>  
>  	/* perform board reset */
>  	phba->fc_eventTag = 0;
>  	phba->link_events = 0;
> -	phba->pport->fc_myDID = 0;
> -	phba->pport->fc_prevDID = 0;
> +	if (phba->pport) {
> +		phba->pport->fc_myDID = 0;
> +		phba->pport->fc_prevDID = 0;
> +	}
>  
>  	/* Turn off parity checking and serr during the physical reset */
>  	pci_read_config_word(phba->pcidev, PCI_COMMAND, &cfg_value);
> @@ -4336,7 +4339,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	/* Restart HBA */
>  	lpfc_printf_log(phba, KERN_INFO, LOG_SLI,
>  			"0337 Restart HBA Data: x%x x%x\n",
> -			phba->pport->port_state, psli->sli_flag);
> +			(phba->pport) ? phba->pport->port_state : 0,
> +			psli->sli_flag);
>  
>  	word0 = 0;
>  	mb = (MAILBOX_t *) &word0;
> @@ -4350,7 +4354,7 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	readl(to_slim); /* flush */
>  
>  	/* Only skip post after fc_ffinit is completed */
> -	if (phba->pport->port_state)
> +	if (phba->pport && phba->pport->port_state)
>  		word0 = 1;	/* This is really setting up word1 */
>  	else
>  		word0 = 0;	/* This is really setting up word1 */
> @@ -4359,7 +4363,8 @@ lpfc_sli_brdrestart_s3(struct lpfc_hba *phba)
>  	readl(to_slim); /* flush */
>  
>  	lpfc_sli_brdreset(phba);
> -	phba->pport->stopped = 0;
> +	if (phba->pport)
> +		phba->pport->stopped = 0;
>  	phba->link_state = LPFC_INIT_START;
>  	phba->hba_flag = 0;
>  	spin_unlock_irq(&phba->hbalock);
> @@ -4446,7 +4451,7 @@ lpfc_sli_brdrestart(struct lpfc_hba *phba)
>   * iteration, the function will restart the HBA again. The function returns
>   * zero if HBA successfully restarted else returns negative error code.
>   **/
> -static int
> +int
>  lpfc_sli_chipset_init(struct lpfc_hba *phba)
>  {
>  	uint32_t status, i = 0;

Reviewed-by: Ewan D. Milne <emilne@xxxxxxxxxx>
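
For anyone following the thread, the failure mode described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not lpfc code: every name below (toy_hba, toy_reset, toy_boot_driver_ran, toy_issue_mbox, toy_get_wwpn, TOY_CMD_READ_WWN) is hypothetical, and the two integers merely stand in for the PCI BAR submission area and the host DMA buffer. It models one thing: the doorbell is the same either way, so the device fetches the command from wherever it was last configured to look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the lpfc driver. */
#define TOY_CMD_READ_WWN 0x11u	/* arbitrary command value for the sketch */

enum toy_mbox_area { TOY_MBOX_IN_BAR, TOY_MBOX_IN_HOST_DMA };

struct toy_hba {
	enum toy_mbox_area dev_area;	/* where the device fetches commands */
	unsigned int bar_mbox;		/* stands in for PCI BAR memory */
	unsigned int host_mbox;		/* stands in for a host DMA buffer */
};

/* Power-on or function reset: the device falls back to BAR submission. */
void toy_reset(struct toy_hba *hba)
{
	assert(hba != NULL);
	hba->dev_area = TOY_MBOX_IN_BAR;
	hba->bar_mbox = 0;
}

/*
 * A boot driver switched the device to host memory and then exited.
 * The device still points at that buffer, whose contents are now stale.
 */
void toy_boot_driver_ran(struct toy_hba *hba)
{
	assert(hba != NULL);
	hba->dev_area = TOY_MBOX_IN_HOST_DMA;
	hba->host_mbox = 0xdeadbeefu;
}

/*
 * The early driver path assumes reset defaults: it places the command in
 * BAR memory and rings the doorbell. Returns 0 if the device actually
 * saw the command, -1 if it fetched something else.
 */
int toy_issue_mbox(struct toy_hba *hba, unsigned int cmd)
{
	unsigned int seen;

	assert(hba != NULL);
	hba->bar_mbox = cmd;
	seen = (hba->dev_area == TOY_MBOX_IN_BAR) ? hba->bar_mbox
						  : hba->host_mbox;
	return (seen == cmd) ? 0 : -1;
}

/* Mirrors the shape of the fix: optionally reset before the mailbox command. */
int toy_get_wwpn(struct toy_hba *hba, int reset_first)
{
	if (reset_first)
		toy_reset(hba);
	return toy_issue_mbox(hba, TOY_CMD_READ_WWN);
}
```

In this model, calling toy_get_wwpn(&hba, 0) after toy_boot_driver_ran(&hba) returns -1, because the device reads the stale host buffer. Passing reset_first = 1 puts device and driver back in agreement on the submission area, so the same call returns 0. A freshly reset toy_hba works without the extra reset, which matches the "defaults are ok" case in the commit message.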