Kay Sievers wrote:
On Fri, 2009-05-29 at 20:11 +0200, Kay Sievers wrote:
On Fri, May 29, 2009 at 19:53, Alan Jenkins <alan-jenkins@xxxxxxxxxxxxxx> wrote:
Maybe just recycling the event processes would bring
similar gains, with less of the risks of threads.
Yeah, I thought that too, without having tested anything, it could be,
that we just want to keep a pipe to the event process, and let the
event process send a signal back to the main daemon, that it has
handled the event, and it goes to sleep after that. The main daemon
can recycle a sleeping event process and push a new event over the
pipe to it. If no events are queued anymore, the main daemon just
closes the pipe, and the event process will exit.
With that model we might be able to reduce the number of fork()s
significantly. And we would still have the process separation, it's
robustness, and the lock-free behavior for malloc, cloexec and all
these issues.
Here is a rough hack to check how it behaves. It boots my box, nothing
else I really checked.
It clones the event processes as the current version does, but the event
process stays around to get re-used as a worker for later events.
Further messages are send over netlink to the worker processes, and the
worker process signals its state back with sigqueue() rt-signals. When
the events have settled, the workers get killed after a few seconds of
idle time.
The current git version:
$ time (udevadm trigger; udevadm settle)
real 0m1.566s
...
$ time /sbin/udevd.orig
[]
user 0m0.420s
sys 0m1.412s
The thread-version:
$ time (udevadm trigger -Snet; udevadm settle)
real 0m1.336s
$ time udev/udevd
[]
user 0m0.310s
sys 0m0.679s
The worker-version:
$ time (udevadm trigger; udevadm settle)
real 0m1.171s
...
$ time udev/udevd
[]
user 0m0.057s
sys 0m0.095s
The thread- and worker-versions do not create as many COW page-faults in
the daemon after every cloned event-process, and therefore need much
less CPU.
At least on the dual-core laptop here, the pool of workers seems to be
faster than the threads.
Great, you beat me to it. It makes sense that this would be _bit_
faster than threads. As I say I tried to avoid the page faults caused
(somehow) by forking the extra thread-stack mappings, but it didn't
really work out. I'm surprised by quite how much faster it is though!
I have some thoughts which may help in bringing the code up from "rough
hack" quality :-).
Aren't signal queues unreliable? If you exceed the maximum number of
queued signals, sigqueue will fail with EAGAIN, and I don't think
there's a blocking version :-). I think a pipe would be better, or
maybe you can do something with netlink.
+ /* wait for more device messages from udevd */
+ do
+ dev = udev_monitor_receive_device(monitor);
+ while (dev == NULL);
I think this loop should finish when the signal handler sets
worker_exit? But maybe you didn't actually install a handler for
SIGTERM and it's still being reset to the default action:
/* set signal handlers */
memset(&act, 0x00, sizeof(act));
act.sa_handler = event_sig_handler;
@@ -154,66 +203,135 @@ static void event_fork(struct udev_event *event)
sigaction(SIGTERM, &act, NULL);
I'm not sure you're handling worker processes crashing, it looks like
you might leave the event in limbo if you get SIGCHLD for a worker in
state WORKER_RUNNING. I'm sure you'll test that though.
Talking of unexpected crashes. I anticipated using a
connection-oriented socketpair for passing events. Using netlink for
this means the worker threads don't get a nice notification if the
daemon dies without killing them. Unless this is covered by udevd being
the session leader, or something like that? I'll test this empirically
- maybe it's not important, but I think it's good to know what will happen.
BTW I had a go at this too, but a couple of my workers fail in about 1
run out of 20, apparently because calloc() returns NULL without setting
errno. I'm sure it's my fault but I'll try to track it down. Obviously
I'll let you know if your patch could be affected by the same problem.
Thanks
Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-hotplug" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html