Re: [RFC][PATCH 2/6] fnic: add fnic_scsi.c and fnic_io.h.

Mike Christie <michaelc@xxxxxxxxxxx> · Mon, 25 Aug 2008 14:31:17 -0500

James Smart wrote:

Mike Christie wrote:
jeykholt@xxxxxxxxx wrote:
fnic: add fnic_scsi.c and fnic_io.h.

fnic_scsi.c contains the FCP SCSI handling as well as firmware reset
and FLOGI registration handling.

I just looked at this one function, because I was fixing the same code
in fcoe.ko/libfc:fc_fcp.c

+int fnic_reset(struct Scsi_Host *shost)
+{
+     struct fc_lport *lp;
+     struct fnic *fnic;
+     unsigned long flags;
+     int ret = SUCCESS;
+     enum fnic_state old_state;
+     DECLARE_COMPLETION_ONSTACK(reset_wait);
+
+     lp = shost_priv(shost);
+     fnic = lp->drv_priv;
+
+     printk(KERN_DEBUG DFX "fnic_reset called\n", fnic->fnic_no);
+
+     /* Issue firmware reset */
+     spin_lock_irqsave(&fnic->fnic_lock, flags);
+     fnic->reset_wait = &reset_wait;
+     old_state = fnic->state;
+     fnic->state = FNIC_IN_FC_TRANS_ETH_MODE;
+     vnic_dev_del_addr(fnic->vdev, fnic->data_src_addr);
+     spin_unlock_irqrestore(&fnic->fnic_lock, flags);
+
+     if (fnic_fw_reset_handler(fnic)) {
+             spin_lock_irqsave(&fnic->fnic_lock, flags);
+             ret = FAILED;
+             if (fnic->state == FNIC_IN_FC_TRANS_ETH_MODE)
+                     fnic->state = old_state;
+             fnic->reset_wait = NULL;
+             spin_unlock_irqrestore(&fnic->fnic_lock, flags);
+             goto fnic_reset_end;
+     }
+
+     /* fw reset is issued, now wait for it to complete */
+     wait_for_completion_timeout(&reset_wait,
+                                 
msecs_to_jiffies(FNIC_HOST_RESET_TIMEOUT));
+
+     /* Check for status */
+     spin_lock_irqsave(&fnic->fnic_lock, flags);
+     fnic->reset_wait = NULL;
+     ret = (fnic->state == FNIC_IN_ETH_MODE) ? SUCCESS : FAILED;
+     spin_unlock_irqrestore(&fnic->fnic_lock, flags);
+
+     /* Now reset local port, this will clean up libFC exchanges,
+      * reset remote port sessions, and if link is up, begin flogi
+      */
+     fc_lport_lock(lp);
+     if (lp->tt.lport_reset(lp))
+             ret = FAILED;

The problem here is that this only starts the login. When fnic_reset
returns for the scsi eh path, scsi-ml is going to send a TUR to make
sure that we are ready to go. If we are not the devices will be
offlined. So unless it is a really quick relogin we are going to offline
the devices by accident.

Well - what should be happening is - prior to the reset or as part of 
it, the fc transport fc_remote_port_delete() call should be made on all 
those remote ports that connectivity is about to be terminated on.  This 
will place all the associated targets/luns on those rports into a 
blocked state, and start the devloss timer on them.  This will suspend 
the eh path as well.  Thus, things suspend until either the driver/fcoe 

What do you mean by that? For lpfc it will or for this driver? This 
driver does not have that block call like lpfc_block_error_handler, so 
if the rport event occurs after the scsi eh is running we do not suspend 
the eh.

So below I am saying we should make the lpfc_block_error_handler 
functionality and the equivalent in the qla2xxx and mpfc common so 
libfc/fcoe and fnic can use it.

stack re-login's and re-calls the transport with the same remote port 
(thus the transport will unblock the targets/luns), or devloss_tmo 
expires, at which time it is correct to report loss of connectivity.

Of course, all this assumes the fc_host stays in existence.

For fc_fcp.c I added a hokey loop and wait like some other drivers. We
could instead have libfc notify any waiters of a state change here. We
could also do a rport blocked timedout helper, convert the fc drivers
and use it here so we only wait for the login to complete or for the
port block to fail.

-- james s

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html