On Mon, 2021-11-08 at 11:30 -0600, Benjamin Marzinski wrote:
> On Mon, Nov 08, 2021 at 04:55:37PM +0000, Martin Wilck wrote:
> > On Mon, 2021-11-08 at 10:29 -0600, Benjamin Marzinski wrote:
> > > On Fri, Nov 05, 2021 at 11:20:01PM +0000, Martin Wilck wrote:
> > > > > 1: udev hasn't gotten an event for a device
> > > >
> > > > I don't think we can detect this without listening for kernel
> > > > uevents. And even if we listen to them, we could have missed
> > > > some. udevd doesn't have an API for it, either, AFAIK.
> > >
> > > Isn't this the most common INIT_PARTIAL case, where we are waiting
> > > for the coldplug uevent? If there is no database file when we are
> > > processing the device, we are in this state. Correct?
> >
> > Not necessarily. udev may have got an event, but not have finished
> > processing it, or failed to process it entirely (e.g. because of a
> > timeout, your case 2.). When udevd sees an "add" or "change" event
> > for a device for the first time, creating the db entry is the last
> > action it takes. During coldplug, udevd will receive a lot of
> > events almost simultaneously, but it may take considerable time
> > until it has processed them all.
>
> Fair enough. So if the first uevent hasn't completed already
> successfully, it's gonna be hard to know why.

I just realized that what I said was wrong. The worker does this
(udev v246):

worker_process_device()
   udev_event_execute_rules()
      udev_rules_apply_to_event()
      update_devnode()                  /* symlinks etc */
      device_ensure_usec_initialized()  /* set USEC_INITIALIZED */
      device_update_db()
      /* here the db file is (re)written */
      /* from this point on, libudev will consider the device initialized */
      update_devnode()                  /* again */
      device_set_is_initialized()       /* sets is_initialized property */
   udev_event_execute_run()
   /* Here RUN is executed */
   if (udev_event->inotify_watch)
      device_update_db()                /* again, this is what I was referring to */
   device_monitor_send_device()         /* send "udev" uevent to listeners */

If a worker is killed because of a timeout, or exits with an error,
udevd will receive SIGCHLD and delete the db entry for the device in
question.

> > > > > 2: udev got an event but timed out or failed while processing it
> > > >
> > > > This would be the USEC_INITIALIZED case, IMO
> > >
> > > If udev has, in the past, successfully processed an event for a
> > > device, but fails while processing a later uevent, it doesn't keep
> > > the data from the previous event. So it could lose the
> > > uid_attribute. But the database file should still exist. Correct?
> >
> > That's true. But we can't do anything about it. libudev will return
> > what the database currently contains, which is the content from the
> > last successfully processed "add" or "change" uevent, whether or not
> > other uevents are in the queue or being processed. I don't think
> > this scenario matters in the current discussion about partially
> > initialized paths. This is the "changed wwid" scenario, which I
> > think we handle quite nicely today already. Or am I misunderstanding?
>
> If both events occurred before multipathd started up, then this
> wouldn't simply be a "changed WWID". The hope is to be able to
> reliably distinguish this from case 3, where the data from udev is
> fine, but the uid_attribute still isn't there.

I'm not quite getting this. Are you talking about the case where one,
or both, uevents are lost?

IMO we can use udev_device_get_is_initialized() as a test. AFAICS it's
basically equivalent to USEC_INITIALIZED being set. If that function
returns true and uid_attribute is not set, re-triggering an uevent
makes relatively little sense to me. The likelihood that the next
event will yield a different result is pretty close to zero.
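Roughly what I have in mind is sketched below (just an illustration
against libudev, not actual multipathd code; the device name "sda" and
ID_SERIAL standing in for the uid_attribute are placeholders, and the
classification logic is only a guess at how multipathd might use it):

#include <stdio.h>
#include <libudev.h>

/* Sketch only: classify a block device along the lines discussed,
 * using udev_device_get_is_initialized() as the test. */
static void classify(struct udev *udev, const char *name)
{
	struct udev_device *dev;
	const char *uid;

	dev = udev_device_new_from_subsystem_sysname(udev, "block", name);
	if (!dev) {
		printf("%s: device not found\n", name);
		return;
	}
	if (udev_device_get_is_initialized(dev) <= 0) {
		/* no db file (yet), e.g. coldplug uevent still pending,
		 * or the db entry was deleted after a failed worker */
		printf("%s: not initialized by udev\n", name);
	} else {
		/* db file exists <=> USEC_INITIALIZED is set */
		uid = udev_device_get_property_value(dev, "ID_SERIAL");
		if (uid)
			printf("%s: uid %s\n", name, uid);
		else
			/* initialized, but no uid_attribute:
			 * re-triggering is unlikely to change this */
			printf("%s: initialized, uid missing\n", name);
	}
	udev_device_unref(dev);
}

int main(void)
{
	struct udev *udev = udev_new();

	if (!udev)
		return 1;
	classify(udev, "sda");	/* placeholder device name */
	udev_unref(udev);
	return 0;
}

(build with -ludev; multipathd obviously has all the device handling
already, the point is only the is_initialized vs. property distinction)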
If the first uevent is handled successfully but the second times out,
udevd will delete the db file (see above), but no uevent will be sent
to listeners. So multipathd will be ignorant of the lost / timed-out
event. There's nothing we can do about that.

Regards
Martin

>
> -Ben
>
> > Cheers,
> > Martin
>

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel