On 2023-08-06 03:42, Ross Boylan wrote:
On Fri, Aug 4, 2023 at 4:32 PM Kevin P. Fleming
<lists.systemd-devel@xxxxxxxxxxxxx> wrote:
On Fri, Aug 4, 2023, at 18:11, Ross Boylan wrote:
Theory: since br0 has no associated IP address when socket creation is
attempted, the socket creation fails. If so, I need to delay socket
startup until br0 has an IP4 address, but I'm not sure how to do
that--or even if that is the problem.
This is almost certainly the cause, and the reason that the 'FreeBind' parameter can be set in .socket files :-)
Thank you, Kevin. Setting FreeBind=yes results in successful socket
activation on system startup.
I still find the description of FreeBind on the man page puzzling:
"Controls whether the socket can be bound to non-local IP addresses."
But 192.168.1.10 is a local IP address, and for that matter one can
only directly create sockets on the local machine. The rest of the
description makes clear the option is for my case, but I don't see how
that relates to the quoted sentence. Presumably the problem is the
meaning of "non-local IP addresses". Can anyone explain?
"Local" in the same sense as 'localhost'.
If the IP address is configured on any of the machine's interfaces, then
it is a "local" address -- and if it's not assigned on an interface
(yet), then it's non-local.
At the time .socket units start, in most cases, the network interfaces
are likely to not have any IP addresses set up yet (it's even somewhat
deliberate that .socket units start before services), so 192.168.1.10 is
not yet "local" at that point in time, and sockets cannot bind() to it
yet -- you get "Cannot assign requested address" as the error message.
(This applies equally to systemd .socket units as to listening sockets
that daemons might set up directly.)
The "free bind" option bypasses this restriction; it lets bind() succeed
even when the address is not (yet) assigned to any local interface,
although the socket still doesn't actually begin receiving packets until
later (when that IP address gets configured on a local interface).
It is possible to delay .socket startup until after an interface is
configured (probably by ordering it *after* network-online.target,
whereas your current version has an implicit 'before' instead), but
it'll be easier to enable FreeBind=.
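For comparison, here is an untested sketch of the ordering approach. The
assumptions in it are not from the unit posted in this thread: socket units
are ordered before sockets.target by default, which can lead to an ordering
cycle with network-online.target on some setups, so the sketch turns off
DefaultDependencies= and restates the sysinit.target and shutdown.target
dependencies by hand. It also installs into multi-user.target rather than
network-online.target, which seems cleaner for a unit that is itself ordered
after network-online.target.

```ini
# /etc/systemd/system/family.socket (untested sketch, ordering variant)
[Unit]
Description=Socket to tickle to update family netboot config
# Drop the implicit Before=sockets.target, then restate the rest by hand.
DefaultDependencies=no
Requires=sysinit.target
After=sysinit.target
Conflicts=shutdown.target
Before=shutdown.target
# Pull in network-online.target and start only after it is reached.
Wants=network-online.target
After=network-online.target

[Socket]
ListenStream=192.168.1.10:14987
Accept=yes

[Install]
WantedBy=multi-user.target
```

Compared with this, a single FreeBind= line leaves the default dependencies
alone, which is why it is the simpler choice.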
Finally, if the machine only has one IP address, it's even easier to not
bother with binding to a specific address at all -- instead specify
"ListenStream=14987" to make it bind to the wildcard 0.0.0.0 and [::]
addresses. Such a socket will automatically listen on any
current *and future* IP addresses assigned to the machine.
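As a rough sketch of that wildcard variant, the [Socket] section might look
like the following. It assumes the rest of the unit stays as posted and that
listening on every address is acceptable. Per systemd.socket(5), a bare port
number is opened as an IPv6 socket, and whether it also accepts IPv4 depends
on BindIPv6Only= (by default it does).

```ini
# /etc/systemd/system/family.socket (sketch, wildcard variant)
[Socket]
# Port only: binds to the wildcard address, so neither FreeBind= nor
# BindToDevice= is needed, and addresses assigned later are covered too.
ListenStream=14987
Accept=yes
```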
Did I only run into this problem because I specified a BindToDevice
directive? It seemed like a good idea since there are potentially 2
interfaces the socket could attach to, either the virtual interface
br0 or the actual physical network interface that requests come in on.
No, but the directive is not really useful here. Sockets do not "attach"
to interfaces in the way you probably imagine -- they primarily attach
to IP addresses and let the system's IP stack handle everything else.
(That is, sockets *do not* directly grab packets from an interface; the
network stack does that globally.)
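Putting the advice so far together, a minimal sketch of the [Socket] section
that keeps the specific address but drops BindToDevice= could look like this
(only directives already used in this thread; the [Unit] and [Install]
sections are assumed unchanged):

```ini
# /etc/systemd/system/family.socket (sketch, without BindToDevice=)
[Socket]
ListenStream=192.168.1.10:14987
Accept=yes
# Allow bind() before 192.168.1.10 has been assigned to br0.
FreeBind=yes
```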
The message
    systemd[1]: Listening on Socket to tickle to update family netboot config.
still occurs interspersed with kernel messages from ~2s after boot,
before IP addresses are configured.
Ross
Current config:

# /etc/systemd/system/family.socket
[Unit]
Description=Socket to tickle to update family netboot config

[Install]
WantedBy=network-online.target

[Socket]
ListenStream=192.168.1.10:14987
# want to run a new job, aka service, for each connection.
Accept=Yes
BindToDevice=br0
# must wait until it has an IP address
FreeBind=true
# 2s is default
TriggerLimitIntervalSec=5s