On Tue, 2019-10-08 at 08:21 +0200, Hannes Reinecke wrote:
> On 10/7/19 10:45 PM, Ewan D. Milne wrote:
> >
> > The patch itself looks OK, but I was wondering about a couple of things:
> >
> > - There are other places in scsi_dh_alua where the ASC/ASCQ 04 0A is checked
> >   and we retry, I understand that this is a particular case you are solving
> >   but is the changing of the state to -> transitioning (because that's what
> >   the device said the state was) applicable in those other cases?
> >
> No. The original code was built around the assumption that RTPG would
> return the status of the device; consequently we would have to retry
> RTPG until we get a final status. But as mentioned, there are arrays
> which cannot return RTPG data during transitioning, so the code would
> never be able to detect a transitioning state.
> With this patch we set the state directly once the said sense code is
> received.
> But this applies _only_ to the RTPG command, as this is required to move
> the state machine along.
> None of the other commands are affected.
>
> > - The code originally seems to have been under the assumption that the
> >   transitioning state was a transient event, so the retry would pick up
> >   the eventual state. Now, some storage arrays spend a long time in the
> >   transitioning state, but if we don't send another command are we going to
> >   get the sense (or the UA) that triggers entry to the eventual ALUA state?
> >
>
> Note, there are two types of retries.
> The one is the 'normal' command retry, where we resend a command a given
> number of times to retrieve the final status.
> This is precisely the error which caused this patch.
>
> And then there is a scheduled retry; here we essentially poll the array
> with sending RTPG in regular intervals until the 'transitioning' state
> is gone. (Check for 'alua_rtpg()' and the handling of the SCSI_DH_RETRY
> return value).
> With the patch we continue to trigger that second type of
> retries, which will eventually clear the transitioning state.
>
> Cheers,
>
> Hannes

Thanks for the explanation.  The patch looks good.

Reviewed-by: Ewan D. Milne <emilne@xxxxxxxxxx>
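
For anyone reading along in the archive, here is a rough userspace sketch of
the behaviour described above, as I understand it. It is not the
scsi_dh_alua code: struct fake_array, send_rtpg(), rtpg_once() and
poll_until_settled() are hypothetical names made up for illustration, and
the "array" is a counter that answers RTPG with NOT READY / 04 0A a fixed
number of times. The point is only that the sense code sets the state to
transitioning right away, and the scheduled polling is what later picks up
the final state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; names do not come from the kernel. */
enum alua_state { STATE_UNKNOWN, STATE_OPTIMIZED, STATE_TRANSITIONING };
enum rtpg_result { RTPG_OK, RTPG_RETRY, RTPG_ERROR };

struct sense { unsigned char key, asc, ascq; };

/* Simulated array that cannot return RTPG data while transitioning. */
struct fake_array {
    int transitions_left; /* RTPGs still to be answered with 02/04/0A */
    int commands_sent;    /* total RTPGs issued to the array */
};

/* Issue one RTPG; true means data was returned in *reported. */
static bool send_rtpg(struct fake_array *dev, struct sense *s,
                      enum alua_state *reported)
{
    assert(dev != NULL);
    dev->commands_sent++;
    if (dev->transitions_left > 0) {
        dev->transitions_left--;
        s->key = 0x02;  /* NOT READY */
        s->asc = 0x04;
        s->ascq = 0x0a; /* asymmetric access state transition */
        return false;
    }
    *reported = STATE_OPTIMIZED;
    return true;
}

/*
 * One RTPG attempt. On 04/0A the state is set to transitioning directly
 * and the caller is asked to come back later, instead of resending the
 * command immediately in the hope of a final status.
 */
static enum rtpg_result rtpg_once(struct fake_array *dev,
                                  enum alua_state *state)
{
    struct sense s = {0, 0, 0};
    enum alua_state reported = STATE_UNKNOWN;

    if (send_rtpg(dev, &s, &reported)) {
        *state = reported;
        return RTPG_OK;
    }
    if (s.key == 0x02 && s.asc == 0x04 && s.ascq == 0x0a) {
        *state = STATE_TRANSITIONING;
        return RTPG_RETRY;
    }
    return RTPG_ERROR;
}

/*
 * Stand-in for the scheduled retry: poll with RTPG until the result is
 * no longer "retry". A real handler would requeue delayed work between
 * polls rather than loop. Returns the number of polls performed.
 */
static int poll_until_settled(struct fake_array *dev,
                              enum alua_state *state, int max_polls)
{
    int polls = 0;

    while (polls < max_polls) {
        polls++;
        if (rtpg_once(dev, state) != RTPG_RETRY)
            break;
    }
    return polls;
}
```

With transitions_left set to 3, a single rtpg_once() leaves the state at
STATE_TRANSITIONING after one command, and a following
poll_until_settled() needs three more polls before the state becomes
STATE_OPTIMIZED.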