On Wed, 2021-11-17 at 15:21 -0600, Benjamin Marzinski wrote:
> So, it turns out that commit 4ef67017 "libmultipath: add
> update_pathvec_from_dm()" already does most of the hard work of
> making multipath handle the uninitialized paths that exist during
> boot, after the switch root, but before all the coldplug uevents
> have been processed. The only problem is that this code leaves the
> paths in a partially initialized state, which doesn't completely get
> corrected until a reconfigure happens.
>
> [...]
>
> I've tested these patches both by rebooting with necessary and
> unnecessary multipath devices in the initramfs and multipathd.service
> set to make multipathd start up at various points after the switch
> root, and by manually sending remove uevents to uninitialize some
> paths, and then starting multipathd to test specific combinations of
> path states. But more testing is always welcome.

My latest testing has revealed an issue with this patch with explicit
ALUA. It's similar to what you solved with the "ghost_delay" parameter
in the past.

With this patch, multipathd now starts before SCSI device detection
begins, and as soon as multipathd sets up a map, I/O on this map may be
started. With arrays supporting active/optimized and
active/non-optimized states and explicit ALUA, this causes unnecessary
path state switching if paths in non-optimized state are detected
before optimized ones. I/O will cause scsi_dh_activate() to be called
in the kernel, and this will run an STPG, which always uses
active/optimized as target state. With RDAC, we'll have a similar
problem. The other device handlers don't distinguish active and optimal
states, AFAICS.

I fear this behavior will not be welcome in some configurations. So far
I haven't made up my mind whether, and if so how, we can fix it.

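To make the ordering problem concrete, here is a toy model in C. It is
only a sketch under simplifying assumptions, not kernel or multipathd
code: the enum and both function names are hypothetical, paths are
assumed to be discovered in array order, and "activation" is reduced to
the single question of whether an STPG toward active/optimized would be
issued. The second function models a short grace period by waiting
until a given number of paths have been seen before I/O starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ALUA access states for this toy model only. */
enum toy_alua_state {
	TOY_ACTIVE_OPTIMIZED,
	TOY_ACTIVE_NON_OPTIMIZED,
};

/*
 * Map is set up as soon as the first path appears and I/O starts
 * right away. Returns true if that first path is non-optimized,
 * i.e. the case in which activation would issue an STPG.
 */
bool toy_first_io_triggers_stpg(const enum toy_alua_state *paths, size_t n)
{
	assert(paths != NULL || n == 0);
	if (n == 0)
		return false;
	return paths[0] == TOY_ACTIVE_NON_OPTIMIZED;
}

/*
 * Same model, but I/O is held back until `settled` paths have been
 * seen (a stand-in for a short, ghost_delay-like grace period).
 * An STPG is only needed if none of the paths seen so far is
 * optimized.
 */
bool toy_delayed_io_triggers_stpg(const enum toy_alua_state *paths,
				  size_t n, size_t settled)
{
	size_t seen = settled < n ? settled : n;

	assert(paths != NULL || n == 0);
	for (size_t i = 0; i < seen; i++)
		if (paths[i] == TOY_ACTIVE_OPTIMIZED)
			return false;
	return seen > 0;
}
```

In this model, a non-optimized path arriving first triggers the STPG
when I/O starts immediately, while waiting for the second (optimized)
path avoids it. How long such a wait would have to be in practice is
exactly the open question.
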
I suppose something similar to ghost_delay would be possible on the
multipath-tools side, but it's not straightforward, because
non-optimized paths simply count as PATH_UP in multipathd. Also, the
delay should probably be much shorter than for PATH_GHOST. In my
testing against a LIO target, it was a matter of milliseconds which
path would appear first.

Alternatively, maybe we can reconsider the way scsi_dh_activate()
works? Perhaps it doesn't have to switch from active/non-optimized to
active/optimized state? OTOH, there are other situations (explicit path
group switch) where we'd want exactly that.

The other alternative would be waiting for udev settle again. I'd
really like to avoid that.

Ideas and thoughts highly welcome.

Regards,
Martin