In a controller failover, do not fail paths that are transitioning, or an unexpected I/O error will be returned when accessing a multipath device. Consider this case: a two-controller array with paths coming from a primary and a secondary controller. During any upgrade there is a transition from the secondary to the primary state.

1. In the beginning all paths are active/optimized:

3624a9370602220bd9986439b00012c1e dm-12 PURE,FlashArray
size=3.0T features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
  |- 13:0:0:11 sdch 69:80  active ready running
  |- 14:0:0:11 sdce 69:32  active ready running
  |- 15:0:0:11 sdcf 69:48  active ready running
  |- 1:0:0:11  sdci 69:96  active ready running
  |- 9:0:0:11  sdck 69:128 active ready running
  |- 10:0:0:11 sdcg 69:64  active ready running
  |- 11:0:0:11 sdcd 69:16  active ready running
  `- 12:0:0:11 sdcj 69:112 active ready running

CT0 paths - sdce, sdcf, sdcg, sdcd
CT1 paths - sdch, sdci, sdck, sdcj

2. Run I/O to the multipath device:

[root@init115-15 ~]# /opt/Purity/bin/bb/pureload -m initthreads=32 /dev/dm-12
Thu Jul 8 13:33:47 2021: /opt/Purity/bin/bb/pureload num_cpus = 64
Thu Jul 8 13:33:47 2021: /opt/Purity/bin/bb/pureload num numa nodes 2
Thu Jul 8 13:33:47 2021: /opt/Purity/bin/bb/pureload Starting test with 32 threads

3. In an upgrade the primary controller is failed, and the secondary controller transitions to primary. From an ALUA perspective, the paths to the previous primary go to ALUA state unavailable, while the paths to the promoting primary move to ALUA state transitioning. It is expected that 4 paths will fail:

Jul 8 13:33:58 init115-15 kernel: sd 14:0:0:11: [sdce] tag#1178 Add. Sense: Logical unit not accessible, target port in unavailable state
Jul 8 13:33:58 init115-15 kernel: sd 15:0:0:11: [sdcf] tag#1374 Add. Sense: Logical unit not accessible, target port in unavailable state
Jul 8 13:33:58 init115-15 kernel: sd 10:0:0:11: [sdcg] tag#600 Add. Sense: Logical unit not accessible, target port in unavailable state
Jul 8 13:33:58 init115-15 kernel: sd 11:0:0:11: [sdcd] tag#1460 Add. Sense: Logical unit not accessible, target port in unavailable state
Jul 8 13:33:58 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:64.
Jul 8 13:33:58 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:48.
Jul 8 13:33:58 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:16.
Jul 8 13:33:58 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:32.
Jul 8 13:33:58 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 7
Jul 8 13:33:58 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 6
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 5
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 4

4. It is not expected that the remaining 4 paths will also fail. They did not fail this way until the change which introduced BLK_STS_AGAIN into the SCSI ALUA device handler. With that change, new I/O which reaches the handler on a path that is in ALUA state transitioning results in that path being failed (a paraphrased sketch of the handler logic follows below). Linux versions before that change do not return an I/O error back to the client application, and this problem does not happen on other operating systems, e.g. ESXi, Windows, AIX, etc.
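For reference, the handler behavior described in step 4 boils down to a decision like the one below. This is a paraphrased sketch, not the verbatim upstream scsi_dh_alua code; the function name is made up for illustration and alua_state stands for the cached target port group state:

/*
 * Paraphrased sketch of the ALUA device handler's prep_fn decision
 * after the BLK_STS_AGAIN change (illustrative only, not the exact
 * upstream code). A request dispatched to a path whose port group
 * is still transitioning now completes with BLK_STS_AGAIN instead
 * of being issued to the target.
 */
static blk_status_t alua_prep_fn_sketch(unsigned char alua_state)
{
        switch (alua_state) {
        case SCSI_ACCESS_STATE_OPTIMAL:
        case SCSI_ACCESS_STATE_ACTIVE:
        case SCSI_ACCESS_STATE_LBA:
                return BLK_STS_OK;      /* issue the command */
        case SCSI_ACCESS_STATE_TRANSITIONING:
                return BLK_STS_AGAIN;   /* completes without any I/O */
        default:
                return BLK_STS_IOERR;   /* standby, unavailable, offline */
        }
}

dm-multipath then sees that BLK_STS_AGAIN completion in multipath_end_io(), and since blk_path_error() treats BLK_STS_AGAIN as a retryable path error, it currently fails the path like any other error, which is what this patch changes.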
5. It is not expected that the paths to the promoting primary fail, yet they do:

Jul 8 13:33:59 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:96.
Jul 8 13:33:59 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:112.
Jul 8 13:33:59 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:80.
Jul 8 13:33:59 init115-15 kernel: device-mapper: multipath: 253:12: Failing path 69:128.
Jul 8 13:33:59 init115-15 multipath[53813]: dm-12: no usable paths found
Jul 8 13:33:59 init115-15 multipath[53833]: dm-12: no usable paths found
Jul 8 13:33:59 init115-15 multipath[53853]: dm-12: no usable paths found
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 3
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 2
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 1
Jul 8 13:33:59 init115-15 multipathd[46030]: 3624a9370602220bd9986439b00012c1e: remaining active paths: 0

6. The error gets back to the user of the multipath device unexpectedly:

Thu Jul 8 13:33:59 2021: /opt/Purity/bin/bb/pureload I/O Error: io 43047 fd 36 op read offset 00000028ef7a7000 size 4096 errno 11 rsize -1

(errno 11 is EAGAIN, matching the BLK_STS_AGAIN completion that propagated back to the application.)

The earlier patch I made for this was not desirable, so I am proposing this much smaller patch, which similarly does not allow the transitioning paths to result in immediate failure.

Signed-off-by: Brian Bunker <brian@xxxxxxxxxxxxxxx>
Acked-by: Krishna Kant <krishna.kant@xxxxxxxxxxxxxxx>
Acked-by: Seamus Connor <sconnor@xxxxxxxxxxxxxxx>
---

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index bced42f082b0..d5d6be96068d 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1657,7 +1657,7 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
 		else
 			r = DM_ENDIO_REQUEUE;
 
-		if (pgpath)
+		if (pgpath && (error != BLK_STS_AGAIN))
 			fail_path(pgpath);
 
 		if (!atomic_read(&m->nr_valid_paths) &&

Brian Bunker
SW Eng
brian@xxxxxxxxxxxxxxx
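For context, with this change applied the error branch of multipath_end_io() behaves roughly as sketched below. This is a paraphrase assembled from the diff context above rather than a verbatim copy of drivers/md/dm-mpath.c: a BLK_STS_AGAIN completion from a transitioning path is still requeued, but the path itself is no longer failed.

        if (error && blk_path_error(error)) {
                struct multipath *m = ti->private;

                if (error == BLK_STS_RESOURCE)
                        r = DM_ENDIO_DELAY_REQUEUE;
                else
                        r = DM_ENDIO_REQUEUE;   /* requeue the original request */

                /*
                 * BLK_STS_AGAIN means the ALUA handler refused the request
                 * because the target port group is still transitioning.
                 * Requeue the I/O but keep the path; it is expected to
                 * become usable once the transition completes.
                 */
                if (pgpath && (error != BLK_STS_AGAIN))
                        fail_path(pgpath);

                /* the error reaches the caller only when no valid paths remain */
                if (!atomic_read(&m->nr_valid_paths) &&
                    !must_push_back_rq(m)) {
                        if (error == BLK_STS_IOERR)
                                dm_report_EIO(m);
                        r = DM_ENDIO_DONE;      /* complete with the original error */
                }
        }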