Re: [PATCH 03/24] zfcp: open-code fc_block_scsi_eh() for host reset

On 5/4/22 12:46, Steffen Maier wrote:
On 5/3/22 22:06, Hannes Reinecke wrote:
When issuing a host reset we should wait for all
ports to become unblocked; waiting for just one
might result in the host reset returning too early.

Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Cc: Steffen Maier <maier@xxxxxxxxxxxxx>
Cc: Benjamin Block <bblock@xxxxxxxxxxxxx>
---
  drivers/s390/scsi/zfcp_scsi.c | 21 +++++++++++++++------
  1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index 526ac240d9fe..2e615e1dcde6 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt)
  static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
  {
-    struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device);
-    struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
-    int ret = SUCCESS, fc_ret;
+    struct Scsi_Host *host = scpnt->device->host;
+    struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0];
+    int ret = SUCCESS;
+    unsigned long flags;
+    struct zfcp_port *port;
      if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) {
          zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p");
@@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
      }
      zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
      zfcp_erp_wait(adapter);

At this point the internal zfcp recovery for the adapter has completed, but asynchronous follow-up work is still pending, such as the rport unblock(s); see below.

-    fc_ret = fc_block_scsi_eh(scpnt);
-    if (fc_ret)
-        ret = fc_ret;
+
+    spin_lock_irqsave(&adapter->port_list_lock, flags);

zfcp_adapter.port_list_lock is of type rwlock_t:

+    read_lock_irqsave(&adapter->port_list_lock, flags);
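For context, the field declarations in struct zfcp_adapter (abridged excerpt from zfcp_def.h, quoted from memory, so double-check) show why the reader/writer API applies here:

	/* abridged from drivers/s390/scsi/zfcp_def.h, struct zfcp_adapter */
	struct list_head	port_list;	/* remote port list */
	rwlock_t		port_list_lock;	/* port list lock */

With an rwlock_t, readers such as this iteration take read_lock_irqsave()/read_unlock_irqrestore(), while list manipulation takes the write-side variants.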

+    list_for_each_entry(port, &adapter->port_list, list) {
+        struct fc_rport *rport = port->rport;

While an rport is blocked, port->rport == NULL [see zfcp_scsi_rport_block() and
zfcp_scsi_rport_register()], so this could be a NULL pointer dereference.

Typically, all rports would still be blocked after the adapter recovery above, until they become unblocked via zfcp's asynchronous rport_work: zfcp_erp_try_rport_unblock() runs near the end of the LUN and port recoveries, which happen as follow-up parts of adapter recovery and before zfcp_erp_wait() returns. It eventually calls zfcp_scsi_schedule_rport_register(), which queues the work item port->rport_work, on which we could do flush_work() [one prior art is in zfcp_init_device_configure()].

I am starting to wonder whether we really need to synchronize with the asynchronous rport unblocks (for this adapter) after the zfcp_erp_wait() above and before any dereference of port->rport, either within this loop or with a separate loop before this one; see the sketch below. (Another option would be to iterate the rports differently, via the common-code lists, so that we directly get rport references.)
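To illustrate the "separate loop before this one" idea, a rough sketch (untested; it pins each port via get_device(&port->dev) so that flush_work(), which sleeps, runs without the list lock held, and it assumes no port is removed from adapter->port_list in between, since the list cursor is reused across the unlock):

	/*
	 * Hypothetical pre-pass: flush the async rport work of every
	 * port before the loop below dereferences port->rport.
	 */
	read_lock_irqsave(&adapter->port_list_lock, flags);
	list_for_each_entry(port, &adapter->port_list, list) {
		get_device(&port->dev);	/* pin port across the unlock */
		read_unlock_irqrestore(&adapter->port_list_lock, flags);
		flush_work(&port->rport_work);	/* may sleep */
		put_device(&port->dev);
		read_lock_irqsave(&adapter->port_list_lock, flags);
	}
	read_unlock_irqrestore(&adapter->port_list_lock, flags);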

+
+        ret = fc_block_rport(rport);

I think I just noticed that the callee contains an msleep(), but we hold a read lock on port_list_lock here; that does not seem right [scheduling while atomic?!]. Maybe we need an open-coded version after all, so that we can drop both locks in the correct order (and re-acquire them) in the open-coded copy of fc_block_rport(); see the sketch after the prior-art note below.


The lock order itself looks good, though: we hold port_list_lock here, and fc_block_rport() takes the Scsi_Host host_lock, which matches prior art:
static void zfcp_erp_action_dismiss_adapter(struct zfcp_adapter *adapter)
        read_lock(&adapter->port_list_lock);
        -> zfcp_erp_action_dismiss_port(port)
           -> spin_lock(port->adapter->scsi_host->host_lock);
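If we do open-code it, roughly the following is what I would have in mind (untested sketch; the blocked/fast-fail test mirrors the internals of fc_block_rport(), host_lock nests inside port_list_lock as in the prior art above, and the scan simply restarts after every sleep so the list cursor is never used across an unlock):

	/*
	 * Hypothetical open-coded variant: sleep only with no locks
	 * held, then rescan the whole port list, so we neither
	 * schedule while atomic nor trust a stale list cursor.
	 */
retry:
	read_lock_irqsave(&adapter->port_list_lock, flags);
	list_for_each_entry(port, &adapter->port_list, list) {
		struct fc_rport *rport = port->rport;
		bool blocked;

		if (!rport)	/* NULL while blocked; see above */
			continue;

		/* host_lock nested inside port_list_lock, as in prior art */
		spin_lock(host->host_lock);
		blocked = rport->port_state == FC_PORTSTATE_BLOCKED &&
			  !(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT);
		spin_unlock(host->host_lock);

		if (blocked) {
			read_unlock_irqrestore(&adapter->port_list_lock,
					       flags);
			msleep(1000);	/* as in fc_block_rport() */
			goto retry;
		}
	}
	read_unlock_irqrestore(&adapter->port_list_lock, flags);

Whether to also wait for ports whose rport is still NULL (i.e. blocked and not yet re-registered) is the open question from above.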

+        if (ret)
+            break;

So once we find the first rport with
(rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT),
we break out of the loop and return FAST_IO_FAIL from zfcp_scsi_eh_host_reset_handler(). Is that correct? What about other rports that are still blocked and that the rest of the loop would otherwise have waited for? Or am I missing something regarding the goal of "waiting for all rports to become unblocked"?

Once the loop has completed, the question arises which consolidated return code is the correct one if we got different ret values for different rports in the loop. Does my previous brain dump make sense?:
In my own patch version I was pondering what to return if just a subset
of the ports ran into fast_io_fail. For now I thought that plain fast_io_fail
would be good, to let I/O bubble up for path failover, even if this would
include rports that meanwhile unblocked properly and would not need their
pending requests bubbled up, because they could service them with a
simple retry.
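In code, the consolidation I have in mind would be something like this (hypothetical sketch; zfcp_wait_for_rport_unblock() is a made-up helper standing in for whatever per-port wait we end up with):

	/*
	 * Hypothetical consolidation: do not break on the first
	 * FAST_IO_FAIL; keep scanning and make FAST_IO_FAIL "sticky",
	 * so that one timed-out rport lets pending I/O bubble up for
	 * path failover.
	 */
	int consolidated = SUCCESS;

	list_for_each_entry(port, &adapter->port_list, list) {
		int rc = zfcp_wait_for_rport_unblock(port); /* made up */

		if (rc == FAST_IO_FAIL)
			consolidated = FAST_IO_FAIL; /* sticky, keep going */
	}
	ret = consolidated;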

+    }
+    spin_unlock_irqrestore(&adapter->port_list_lock, flags);

Same here, port_list_lock being an rwlock_t:

+    read_unlock_irqrestore(&adapter->port_list_lock, flags);

      zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret);
      return ret;




--
Mit freundlichen Gruessen / Kind regards
Steffen Maier

Linux on IBM Z and LinuxONE

https://www.ibm.com/privacy/us/en/
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Boeblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294


