On Tue, 2010-09-28 at 00:04 -0500, Mike Christie wrote: > On 09/27/2010 09:06 PM, Nicholas A. Bellinger wrote: > > From: Nicholas Bellinger<nab@xxxxxxxxxxxxxxx> > > > > This patch removes the now legacy host_lock unlock() + lock() optimization > > from lpfc_scsi.c:lpfc_queuecommand(). This also includes setting the > > SHT->unlocked_qcmd=1 for host_lock less lpfc lpfc_queuecommand() operation. > > > > Signed-off-by: Nicholas A. Bellinger<nab@xxxxxxxxxxxxxxx> > > --- > > drivers/scsi/lpfc/lpfc_scsi.c | 4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c > > index 2e51aa6..69fe31e 100644 > > --- a/drivers/scsi/lpfc/lpfc_scsi.c > > +++ b/drivers/scsi/lpfc/lpfc_scsi.c > > @@ -3023,11 +3023,9 @@ lpfc_queuecommand(struct scsi_cmnd *cmnd, void (*done) (struct scsi_cmnd *)) > > goto out_host_busy_free_buf; > > } > > if (phba->cfg_poll& ENABLE_FCP_RING_POLLING) { > > - spin_unlock(shost->host_lock); > > lpfc_sli_handle_fast_ring_event(phba, > > &phba->sli.ring[LPFC_FCP_RING], HA_R0RE_REQ); > > > > - spin_lock(shost->host_lock); > > if (phba->cfg_poll& DISABLE_FCP_RING_INT) > > lpfc_poll_rearm_timer(phba); > > } > > @@ -3723,6 +3721,7 @@ struct scsi_host_template lpfc_template = { > > .slave_destroy = lpfc_slave_destroy, > > .scan_finished = lpfc_scan_finished, > > .this_id = -1, > > + .unlocked_qcmd = 1, > > .sg_tablesize = LPFC_DEFAULT_SG_SEG_CNT, > > .cmd_per_lun = LPFC_CMD_PER_LUN, > > .use_clustering = ENABLE_CLUSTERING, > > @@ -3746,6 +3745,7 @@ struct scsi_host_template lpfc_vport_template = { > > .slave_destroy = lpfc_slave_destroy, > > .scan_finished = lpfc_scan_finished, > > .this_id = -1, > > + .unlocked_qcmd = 1, > > .sg_tablesize = LPFC_DEFAULT_SG_SEG_CNT, > > .cmd_per_lun = LPFC_CMD_PER_LUN, > > .use_clustering = ENABLE_CLUSTERING, > > The FC class sets the rport state and bits with the host lock held. > Drivers were then calling fc_remote_port_chkready from the queuecommand > with the host lock held. If we remove the host lock from queuecommand is > it possible that the on one proc the fc class calls fc_remote_port_add > to re-add a rport, this sets the rport state to online, it unblocks the > devices, but then on some other processor we start calling queuecommand > and see that the rport is not online (maybe blocked with > FC_RPORT_FAST_FAIL_TIMEDOUT set) and so we end up failing the IO? Hey Mike, So it sounds like we have two options here: *) Add a per struct fc_rport lock to protect rport->port_state in fc_remote_port_chkready() (and other places..?) that assume they will be held under host_lock. Unfortuately fc_remote_port_chkready() does not mention the hard requirement for host_lock held usage, so I assume other callers will not either.. :-( *) Drop the lockless ->queuecommand() patches for LLD users of fc_remote_port_chkready() for now and use the legacy ->queuecommand() -> unlock -> do_lld_work() -> lock optimization. Here is what that list currently looks like in drivers/scsi: drivers/scsi/bfa/bfad_im.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/bfa/bfad_im.c: rc = fc_remote_port_chkready(rport); drivers/scsi/fnic/fnic_main.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/fnic/fnic_scsi.c: ret = fc_remote_port_chkready(rport); drivers/scsi/fnic/fnic_scsi.c: if (fc_remote_port_chkready(rport) == 0) drivers/scsi/fnic/fnic_scsi.c: if (fc_remote_port_chkready(rport)) drivers/scsi/ibmvscsi/ibmvfc.c: if (unlikely((rc = fc_remote_port_chkready(rport))) || drivers/scsi/ibmvscsi/ibmvfc.c: if (unlikely(rc || (rport && (rc = fc_remote_port_chkready(rport)))) || drivers/scsi/ibmvscsi/ibmvfc.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/libfc/fc_fcp.c: rval = fc_remote_port_chkready(rport); drivers/scsi/libfc/fc_fcp.c: rval = fc_remote_port_chkready(rport); drivers/scsi/libfc/fc_fcp.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/lpfc/lpfc_scsi.c: err = fc_remote_port_chkready(rport); drivers/scsi/lpfc/lpfc_scsi.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/qla2xxx/qla_os.c: rval = fc_remote_port_chkready(rport); drivers/scsi/qla2xxx/qla_os.c: if (!rport || fc_remote_port_chkready(rport)) drivers/scsi/scsi_transport_fc.c: * the port_state or flags, so that fc_remote_port_chkready will So what you think we should do here..? Also, does anyone know if any of the same type of host_lock held assumptions are also made by libsas and/or libata code..? Best, --nab -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html