On Fri, 2022-05-20 at 19:52 -0700, Brian Bunker wrote:
> From my perspective, the ALUA transitioning state is a temporary state
> where the target is saying that it does not have a permanent state
> yet. Having the initiator try another pg to me doesn't seem like the
> right thing for it to do.

I agree. Unfortunately, there's no logic in dm-multipath saying "I may
switch paths inside a PG, but I may not do PG failover".

> If the target wanted the initiator to use a different pg, it should
> use an ALUA state which would make that clear, standby, unavailable,
> etc. The target would only return an error state if it was aware that
> some other path is in an active state. When transitioning is returned,
> I don't think the initiator should assume that any other pg would be
> a better choice. I think it should assume that the target will make
> its intention clear for that path with a permanent state within a
> transition timeout.

For me the question is still whether trying to send I/O to the path
that is known not to be able to process it makes sense. As noted
elsewhere, your patch just delays the BLK_STS_AGAIN by a few
milliseconds. You want to avoid a PG switch, and I second that, but
IMO that needs a different approach.

> From my perspective the right thing to do is to let the ALUA handler
> do what it is trying to do. If the pg state is transitioning and
> within the transition timeout it should continue to retry that request
> checking each time the transition timeout.

But this means that we should modify the logic not only in
alua_prep_fn() but also for handling of NOT READY conditions, either in
alua_check_sense() or in scsi_io_completion_action(). I agree that this
would make a lot of sense, perhaps more than trying to implement a
cleverer logic in dm-multipath as discussed between Hannes and myself.

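To make the SCSI-layer variant a bit more concrete, here is a rough
userspace sketch of the decision being proposed. It is not kernel code
and not the scsi_dh_alua API: every name in it is hypothetical, and
treating standby/unavailable as "fail the path" is my assumption about
what the target means by those states. The same decision would have to
be applied both when preparing a request and when a NOT READY sense
comes back.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is NOT the scsi_dh_alua API. */
enum alua_state_sketch {
	SK_ACTIVE_OPTIMIZED,
	SK_STANDBY,
	SK_UNAVAILABLE,
	SK_TRANSITIONING,
};

enum io_action_sketch {
	SK_SUBMIT,	/* send the I/O down this path */
	SK_RETRY_LATER,	/* hold the I/O and retry on the same path */
	SK_FAIL_PATH,	/* report failure so the upper layer may fail over */
};

/*
 * Decide what to do with an I/O for a path in the given ALUA state.
 * elapsed_secs is the time since the transition was first observed,
 * transition_tmo_secs the transition timeout for the port group.
 */
static enum io_action_sketch
sketch_prep_action(enum alua_state_sketch state,
		   unsigned int elapsed_secs,
		   unsigned int transition_tmo_secs)
{
	switch (state) {
	case SK_ACTIVE_OPTIMIZED:
		return SK_SUBMIT;
	case SK_TRANSITIONING:
		/* Keep retrying here until the timeout has expired. */
		if (elapsed_secs < transition_tmo_secs)
			return SK_RETRY_LATER;
		return SK_FAIL_PATH;
	case SK_STANDBY:
	case SK_UNAVAILABLE:
	default:
		/* Assumption: a permanent non-active state means "go elsewhere". */
		return SK_FAIL_PATH;
	}
}
```

With a 60 second timeout, a path that has been transitioning for 5
seconds yields SK_RETRY_LATER, so dm-multipath never sees an error and
has no reason to switch PGs. Only once the timeout has expired does the
failure propagate upwards.
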
This is what we need to figure out first: Do we want to change the
logic in the multipath layer, making it somehow aware of the special
nature of "transitioning" state, or should we tune the retry logic in
the SCSI layer such that dm-multipath will "do the right thing"
automatically?

Regards
Martin