When processing the incoming kernel event stream we typically enqueue events, with those events being fed to workers as they become free. The one exception is for events which indicate a timeliness requirement (TIMEOUT=N). Queueing these would violate this constraint and so they are started immediatly regardless of the number of workers. Typically events with timeliness requirements are firmware events. The common case here is for a device add event to pop out of the kernel during device discovery, udev responds by triggering a module load event. The module initialisation may then require firmware to initialise the card which triggers a firmware load event for the device. This firmware event pops out of the kernel into udevs queue. As this is blocking the modprobe the kernel indicates timeliness on the request, and this in turn allows the event to be processed immediatly. This convieniently avoids a potential deadlock should all of the workers be busy running module loads which required firmware. In exit (udevadm control --exit) processing we currently dump the entire incoming queue and ignore any further incoming kernel events. We then wait for the current workers to complete any events they were assigned, before finally exiting. However if any of the pending events should trigger a nested event that new event would not be processed preventing completion of the existing worker event. We will eventually timeout both leading to a long boot delay and in some cases to partially initialised cards. These cards often will not be repaired even during coldplug and are lost requiring manual intervention to recover. Modify exit processing such that we handle events with timeliness constraints on them, bringing them into line with normal processing. This allows events which are triggered from our existing workers events to run to completion and allow completion of those workers. This allows us to flush the queue and exit cleanly. BugLink: http://bugs.launchpad.net/bugs/842560 Signed-off-by: Andy Whitcroft <apw@xxxxxxxxxxxxx> --- udev/udevd.c | 39 ++++++++++++++++++++++++++++----------- 1 files changed, 28 insertions(+), 11 deletions(-) This bug is triggered on some machines we have with two bnx2 network interfaces each of which needs two firmware blobs loading. If we find the root disks in the middle we request udev exit so we can transition to the version in the real root. This request deadlocks resulting in a timeout. The firmware never gets loaded for the second card and manual intervention is required to recover the card. It should also be noted that this change does slightly exacerbate what I believe is an existing bug in the timeout handling. We essentially reset the pending timout to the maximum every time we handle any incoming event, for example when a worker completes. This does render the exact timeout somewhat hard to predict, before and after this change. diff --git a/udev/udevd.c b/udev/udevd.c index 05d4b2d..19f7128 100644 --- a/udev/udevd.c +++ b/udev/udevd.c @@ -584,7 +584,7 @@ static void event_queue_start(struct udev *udev) } } -static void event_queue_cleanup(struct udev *udev, enum event_state match_type) +static void __event_queue_cleanup(struct udev *udev, enum event_state match_type, bool keep_timely) { struct udev_list_node *loop, *tmp; @@ -594,10 +594,25 @@ static void event_queue_cleanup(struct udev *udev, enum event_state match_type) if (match_type != EVENT_UNDEF && match_type != event->state) continue; + /* Keep events which have timelyness requirements + * we will skew these timeouts on coldplug. */ + if (keep_timely && udev_device_get_timeout(event->dev) > 0) + continue; + event_queue_delete(event, false); } } +static void event_queue_cleanup(struct udev *udev, enum event_state match_type) +{ + __event_queue_cleanup(udev, match_type, false); +} + +static void event_queue_cleanup_nontimely(struct udev *udev, enum event_state match_type) +{ + __event_queue_cleanup(udev, match_type, true); +} + static void worker_returned(int fd_worker) { for (;;) { @@ -1580,24 +1595,21 @@ int main(int argc, char *argv[]) int i; if (udev_exit) { - /* close sources of new events and discard buffered events */ + /* close sources of new events and discard buffered events, + * keep the kernel channel so we can pick out events with a + * timeliness requirement to avoid deadlock. */ if (fd_ctrl >= 0) { epoll_ctl(fd_ep, EPOLL_CTL_DEL, fd_ctrl, NULL); fd_ctrl = -1; } - if (monitor != NULL) { - epoll_ctl(fd_ep, EPOLL_CTL_DEL, fd_netlink, NULL); - udev_monitor_unref(monitor); - monitor = NULL; - } if (fd_inotify >= 0) { epoll_ctl(fd_ep, EPOLL_CTL_DEL, fd_inotify, NULL); close(fd_inotify); fd_inotify = -1; } - /* discard queued events and kill workers */ - event_queue_cleanup(udev, EVENT_QUEUED); + /* discard queued non-timely events and kill workers */ + event_queue_cleanup_nontimely(udev, EVENT_QUEUED); worker_kill(udev, 0); /* exit after all has cleaned up */ @@ -1649,8 +1661,13 @@ int main(int argc, char *argv[]) dev = udev_monitor_receive_device(monitor); if (dev != NULL) { - udev_device_set_usec_initialized(dev, now_usec()); - if (event_queue_insert(dev) < 0) + /* If we are exiting then only schedule events with + * timeliness requirements. */ + if (!udev_exit || udev_device_get_timeout(dev) > 0) { + udev_device_set_usec_initialized(dev, now_usec()); + if (event_queue_insert(dev) < 0) + udev_device_unref(dev); + } else udev_device_unref(dev); } } -- 1.7.5.4 -- To unsubscribe from this list: send the line "unsubscribe linux-hotplug" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html