Ed Greshko writes:
On 07/05/14 20:13, Sam Varshavchik wrote:
> So, how should this mess get fixed? Start filing bugs against all these packages, requesting a change to their systemd service file, to state a dependency on network-online.target?

FWIW, I'm running a fully updated F20 system and not seeing any problems for httpd and named
Neither did I, until either the last, or the next to last, systemd update.
I also run with NetworkManager-wait-online.service enabled. There was a specific reason I started running with that enabled....don't remember why. But, you may want to check that.
The server with dhcp, httpd, named, and privoxy does not have NetworkManager installed. Both the WAN and the LAN ports are configured as static IPs.
The server with innd installed has NetworkManager, so I could theoretically enable it there.
http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ documents an alternative target, systemd-networkd-wait-online.service, which does not appear to actually exist anywhere, and is not installed by any package.
The more I dig into the config files, the bigger the clusterfark this appears to be.
The starting point is the above documentation for network.target and network-online.target. The above is supposed to be the authoritative documentation, directly referenced from the man pages. Starting with that, I look at what network-online.target actually says:

[Unit]
Description=Network is Online
Documentation=man:systemd.special(7)
Documentation=http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
After=network.target

It doesn't do anything, it's just a symbolic target. That's fine, so the intent is that stuff that actually needs network connections should declare "After=network-online.target". Then, whatever system service is responsible for initializing the static network connections would declare both "After=network.target" and "Before=network-online.target", so it runs after basic networking is up. Once it succeeds in initializing the network connections, it terminates, network-online.target now gets reached, and all the services that depend on established network connections can now run. That seems to be the desired strategy.
Sounds great. This is actually not such a bad plan of action. It might actually make sense, presuming that all servers that depend on established network connections would specify "After=network-online.target", and not "After=network.target", as they do now. Of course, as I discovered, only kdump.service actually does this. So, this is the first thing that goes off the rails. But the rest of the train quickly follows:
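For illustration, here is roughly what opting a single service into that plan could look like locally, without waiting for the package to change. This is only a sketch: the drop-in file name is made up, httpd.service is just the example at hand, and it only helps if something on the system actually orders itself before network-online.target and delays it until the interfaces are configured.

```ini
# /etc/systemd/system/httpd.service.d/wait-for-network.conf
# (hypothetical file name; any *.conf in the drop-in directory is read)

[Unit]
# Pull network-online.target into the boot transaction ...
Wants=network-online.target
# ... and do not start this service until that target is reached.
After=network-online.target
```

After adding a file like this, "systemctl daemon-reload" makes systemd re-read the unit. The Wants= line matters because After= by itself only orders units that are already being started; it does not pull the target in.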
Now, given the initial design, one would automatically assume that NetworkManager-wait-online.service would follow the master plan, and specify "After=network.target" and "Before=network-online.target", putting all the jigsaw pieces in the correct order. But no, this is what NetworkManager-wait-online.service actually says:

[Unit]
Description=Network Manager Wait Online
Requisite=NetworkManager.service
After=NetworkManager.service
Wants=network.target
Before=network.target network-online.target

It specifies that it should be reached /before/ *both* network.target and network-online.target, rather than after network.target, and before network-online.target.

This really looks like somebody just said "eh, I'm just too lazy to fix all services that should really be executed after reaching network-online.target, I'm just going to fix this by executing NetworkManager-wait-online.service before network.target is reached, and before all the servers that currently require network.target get forked off".
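For comparison, this is a sketch of what the [Unit] section would look like if it followed the ordering the documentation implies. It is not taken from any shipped package; only the two ordering lines differ from the file quoted above, and whether this ordering would actually behave well with NetworkManager is an assumption, not something tested here.

```ini
[Unit]
Description=Network Manager Wait Online
Requisite=NetworkManager.service
# Run once basic networking is up ...
After=NetworkManager.service network.target
Wants=network.target
# ... and hold back only the services that asked for an online network.
Before=network-online.target
```

With this ordering, services declaring only "After=network.target" would no longer be delayed by the wait, which is exactly why they would each need to switch to "After=network-online.target".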
Brilliant.

So, enabling NetworkManager-wait-online.service is required on servers that run dhcp, named, httpd, and other servers. If it's not enabled, a roll of the dice will determine whether any of them will come up properly. And I'll bet none of these RPMs enable it, which is needed for this hack to work. And, if NetworkManager is not enabled, with all network interfaces being initialized to static IPs in /etc/sysconfig/network-scripts, I don't see a way to get this right. It may or may not work, depending on the order systemd chooses to execute scripts, and how long they take. Even the kernel version could be a factor – how long the kernel takes to initialize each network interface.
And the documented alternative, "systemd-networkd-wait-online.service", is still nowhere to be found. yum whatprovides comes up empty.
It should be fun watching all of this implode from the sidelines, as all servers running DHCP and httpd get updated to RHEL 7. Some of them will be fine. Some of them will randomly fail to come up fully. As for those that do manage to work initially, at some later point a systemd update, or a kernel update, will subtly change the order in which stuff gets forked off from systemd, and suddenly break them.
Lots of fun.