The server is regularly updated to current Fedora packages. For the last month or so, the server has not reliably come up in a sane state. Once it responds to pings and I ssh in to examine the aftermath, the boot logs of all network services (named-chroot, httpd, dhcpd, and privoxy) are consistent: each one claims that no network interfaces were up at the time it was started.
After finally getting pissed about having to manually re-brain the server each time it boots, I attached a console monitor and observed that the boot goes /very/ quickly, and the console login prompt comes up about 20-30 seconds before the server even starts responding to pings. It looks like the multi-user target is reached long before networking even comes up.
Last week, I commented on the following curiosity: after sifting through systemd's documentation, I found that it claims that "network.target" gets reached only after basic networking is up, and "network-online.target" gets reached only after all network interfaces are initialized.
Problem number one is that all servers specify "After=network.target", when, according to how I interpret this, they should all really specify "After=network-online.target".
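For illustration, this is roughly what that change would look like as a local drop-in, rather than an edit to the packaged unit. It's only a sketch: the drop-in file name is made up, and I'm assuming the usual /etc/systemd/system/<unit>.d/ override directory. The Wants= line is there because After= on its own only orders units; it does not pull the target into the boot.

```ini
# /etc/systemd/system/named-chroot.service.d/wait-online.conf
# (hypothetical file name; any *.conf file in this directory is read)
[Unit]
# This is added on top of the packaged After=network.target; it does not replace it.
After=network-online.target
# Ordering alone does not pull the target in; Wants= does.
Wants=network-online.target
```

A "systemctl daemon-reload" would be needed before systemd notices the new file.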
After that, it came to my attention that there's a NetworkManager optional subpackage that installs a service that waits for network interfaces to come up, and it's specified as "Before=network.target network-online.target". It seems fairly obvious to me that it should really be "Before=network-online.target" and "After=network.target", with all other services that require a functioning network specifying "After=network-online.target". That made logical sense to me, but it seems that this confusing arrangement makes logical sense to someone else, so, whatever. I do not have NetworkManager installed, but, I figure, why not take a crack at whipping up a dirty hack that basically does the same thing?
But the unexpected result from the hack is that it seems to provide solid proof that systemd's dependency resolution is not working. Before I Bugzilla this (as little hope as one might have of getting anything useful done by Bugzillaing this), I'd like to hear some consensus that I am interpreting the following data right. Who knows, I might actually have made a mistake somewhere.
Let's take a look at what named-chroot.service says:

[Unit]
Description=Berkeley Internet Name Domain (DNS)
Wants=nss-lookup.target
Before=nss-lookup.target
After=network.target

Are we all in agreement that named-chroot.service should only be started after network.target gets reached? Ok.
Now, here's my hack, which is basically a clone of that NetworkManager subpackage:
# cat /etc/systemd/system/wait-for-network.service
[Unit]
Description=Wait for network ports to be initialized
Before=network.target network-online.target

[Service]
Type=oneshot
ExecStart=/root/bin/wait-for-network

[Install]
WantedBy=multi-user.target

Are we all in agreement that:

1) This is a one-shot service, and according to systemd's documentation, systemd must wait until this script is complete before it's considered started.
2) Until it's complete, network.target isn't reached.
3) Therefore, this script must finish before systemd should start named-chroot.service.
Yet, after testing this script, then activating it, the server still came up utterly brainless after the reboot. The results:
systemctl status named-chroot.service reports:

named-chroot.service - Berkeley Internet Name Domain (DNS)
   Loaded: loaded (/usr/lib/systemd/system/named-chroot.service; enabled)
   Active: active (running) since Sat 2014-07-12 09:24:29 EDT; 3min 28s ago
…

So, systemd started named-chroot.service at 09:24:29.

My script logs the current timestamp. The output from /root/bin/wait-for-network was as follows:
Sat Jul 12 09:24:27 2014
Interface: lo is up
Sat Jul 12 09:24:32 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is down
Sat Jul 12 09:24:37 2014
Interface: lan0 is up
Interface: lo is up
Interface: wan0 is up

systemd started this script at 09:24:27. This script spun its wheels until 09:24:37, at which time all network interfaces finally came up. I'm happy to post the contents of this short script; however, I don't think that it's relevant here, because the problem is that this script was still running when systemd decided to run named-chroot.service, even though, according to the above, this should not happen.
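To make the overlap concrete, here is a small shell sketch that compares the three timestamps quoted above. The variable and function names are made up for illustration; only the times themselves come from the outputs. It assumes all three timestamps fall on the same day, which they do here.

```shell
#!/bin/sh
# Timestamps copied from the two outputs above; all on the same day,
# so comparing HH:MM:SS is enough.
script_start="09:24:27"   # first line logged by wait-for-network
script_end="09:24:37"     # first report with all three interfaces up
named_start="09:24:29"    # "Active: ... since" for named-chroot.service

# Convert HH:MM:SS to seconds since midnight (awk reads "09" as 9).
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

start=$(to_seconds "$script_start")
end=$(to_seconds "$script_end")
named=$(to_seconds "$named_start")

# named-chroot overlaps the script if it started strictly inside its run.
if [ "$named" -gt "$start" ] && [ "$named" -lt "$end" ]; then
    verdict="overlap"
    echo "named-chroot started $((named - start))s after the script began and $((end - named))s before it finished"
else
    verdict="no overlap"
    echo "named-chroot did not start while the script was running"
fi
```

With the times above, named-chroot.service started 2 seconds into the script's run, with 8 seconds still to go.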
So, either I'm misreading the description of "oneshot" in systemd.service(5) and of "Before" and "After" in systemd.unit(5), or systemd is broken completely. I think that my understanding of systemd's documentation is very reasonable. So, either systemd is broken, or, if it's supposedly working how it should be working, its documentation is crap and impossible to follow. I see no other possibilities.