On Mon, 2022-01-24 at 12:19 +0100, Martin Wilck wrote:
>
> My late testing has revealed an issue with this patch with explicit
> ALUA. It's similar to what you solved with the "ghost_delay" parameter
> in the past.
>
> With this patch, multipathd now starts before SCSI device detection
> begins, and as soon as multipathd sets up a map, I/O on this map may
> be started. With arrays supporting Active/optimized and
> Active/non-optimized states and explicit ALUA, this causes unnecessary
> path state switching if paths in non-optimized state are detected
> before optimized ones. I/O will cause scsi_dh_activate() to be called
> in the kernel, and this will run an STPG, which always uses
> active/optimized as target state.
>
> With RDDAC, we'll have a similar problem. The other device handlers
> don't distinguish active and optimal states, AFAICS.
>
> I fear this behavior will not be welcome in some configurations. So
> far I haven't made up my mind how, and if at all, we can fix it. I
> suppose something similar to ghost_delay would be possible on the
> multipath-tools side, but it's not straightforward, because
> non-optimized paths simply count as PATH_UP in multipathd. Also, the
> delay should probably be much shorter than for PATH_GHOST. In my
> testing against a LIO target, it was a matter of milliseconds which
> path would appear first.
>
> Alternatively, maybe we can consider the way scsi_dh_activate() works?
> Perhaps it doesn't have to switch from active/non-optimized to
> active/optimized state? OTOH, there are other situations (explicit
> path group switch) where we'd want exactly that.

In discussion with Hannes, we came to the conclusion:

- for ALUA, the effect mentioned in my post can be avoided using the
  kernel parameter "scsi_dh_alua.optimize_stpg=1". Confirmed by
  testing.
- even if this parameter is not used, spurious switching between
  non-optimized and optimized state is non-fatal, and much less
  resource-intensive on the storage side than switching between
  active and standby states.

So, it's not a big issue, after all...

> The other alternative would be waiting for udev settle again. I'd
> really like to avoid that.

... and this won't be necessary.

Martin