Re: [PATCH v3 21/26] lpfc: Rework locking on SCSI io completion

Hannes Reinecke <hare@xxxxxxx> · Sat, 26 Jan 2019 10:43:31 +0100

On 1/26/19 2:18 AM, James Smart wrote:
A scsi host lock is taken on every io completion to check whether
the abort handler is waiting on the io completion. This is an
expensive lock to take on all completion when rarely in an abort
condition.

Replace scsi host lock with command-specific lock. Synchronize
completion and abort paths by new cmd lock. Ensure all flag
changing and nulling of context pointers taken under lock.
When adding lock to task management abort, realized it was
missing other synchronization locks. Added that synchronization
to match normal paths.

Signed-off-by: Dick Kennedy <dick.kennedy@xxxxxxxxxxxx>
Signed-off-by: James Smart <jsmart2021@xxxxxxxxx>
---
  drivers/scsi/lpfc/lpfc_init.c |  1 +
  drivers/scsi/lpfc/lpfc_nvme.c | 43 ++++++++++++--------
  drivers/scsi/lpfc/lpfc_scsi.c | 92 +++++++++++++++++++++----------------------
  drivers/scsi/lpfc/lpfc_sli.c  | 42 +++++++++++++++-----
  drivers/scsi/lpfc/lpfc_sli.h  |  1 +
  5 files changed, 107 insertions(+), 72 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 7ee9120b10f6..bcae215cd18c 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4160,6 +4160,7 @@ lpfc_new_io_buf(struct lpfc_hba *phba, int num_to_alloc)
  		lpfc_ncmd->dma_sgl = lpfc_ncmd->data;
  		lpfc_ncmd->dma_phys_sgl = lpfc_ncmd->dma_handle;
  		lpfc_ncmd->cur_iocbq.context1 = lpfc_ncmd;
+		spin_lock_init(&lpfc_ncmd->buf_lock);
  
  		/* add the nvme buffer to a post list */
  		list_add_tail(&lpfc_ncmd->list, &post_nblist);
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 9480257c5143..4b19be386b7b 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -969,15 +969,19 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
  	uint32_t *ptr;
  
  	/* Sanity check on return of outstanding command */
-	if (!lpfc_ncmd || !lpfc_ncmd->nvmeCmd) {
-		if (!lpfc_ncmd) {
-			lpfc_printf_vlog(vport, KERN_ERR,
-					 LOG_NODE | LOG_NVME_IOERR,
-					 "6071 Null lpfc_ncmd pointer. No "
-					 "release, skip completion\n");
-			return;
-		}
+	if (!lpfc_ncmd) {
+		lpfc_printf_vlog(vport, KERN_ERR,
+				 LOG_NODE | LOG_NVME_IOERR,
+				 "6071 Null lpfc_ncmd pointer. No "
+				 "release, skip completion\n");
+		return;
+	}
  
+	/* Guard against abort handler being called at same time */
+	spin_lock(&lpfc_ncmd->buf_lock);
+
+	if (!lpfc_ncmd->nvmeCmd) {
+		spin_unlock(&lpfc_ncmd->buf_lock);
  		lpfc_printf_vlog(vport, KERN_ERR, LOG_NODE | LOG_NVME_IOERR,
  				 "6066 Missing cmpl ptrs: lpfc_ncmd %p, "
  				 "nvmeCmd %p\n",
@@ -1157,6 +1161,7 @@ lpfc_nvme_io_cmd_wqe_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pwqeIn,
  		nCmd->done(nCmd);
  		lpfc_ncmd->nvmeCmd = NULL;
  	}
+	spin_unlock(&lpfc_ncmd->buf_lock);
  
  	/* Call release with XB=1 to queue the IO into the abort list. */
  	lpfc_release_nvme_buf(phba, lpfc_ncmd);
Hmm. Wouldn't it be better here to set lpfc_ncmd->nvmeCmd to NULL first, 
then release the lock, and _then_ call nCmd->done()?
With the current code there might be a risk of accidental command 
starvation, as ->done() is called before the command itself is being 
released, hence the old command will not be available for re-use after 
the call to ->done().

Otherwise it looks okay.

Cheers,

Hannes