Please either use reply to all or reply to list. Do not send personal
replies to a public list discussion.

On Wed, Jul 12, 2023 at 12:01 PM LunarLambda <lunarlambda@xxxxxxxxx> wrote:
>
> What about the GatewayOnLink= inside the container? Isn't it meant for exactly this? Why does ip r ... onlink work but doing it via networkd doesn't?
>

Not sure. I tested it on openSUSE Tumbleweed with systemd 253.5 and it
seems to work:

tumbleweed:/run/systemd/network # cat dummy.netdev
[NetDev]
Kind=dummy
Name=dummy0
MACAddress=none
tumbleweed:/run/systemd/network # cat dummy0.network
[Match]
Name=dummy0

[Network]
Address=92.0.0.1/32

[Route]
Gateway=37.0.0.2
GatewayOnLink=true
tumbleweed:/run/systemd/network # ip a
5: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether be:3a:49:74:6b:7a brd ff:ff:ff:ff:ff:ff
    inet 92.0.0.1/32 scope global dummy0
       valid_lft forever preferred_lft forever
    inet6 fe80::bc3a:49ff:fe74:6b7a/64 scope link
       valid_lft forever preferred_lft forever
tumbleweed:/run/systemd/network # ip r
default via 37.0.0.2 dev dummy0 proto static onlink
tumbleweed:/run/systemd/network #

> The old setup used
> ip r add 91.x.x.x dev br0 src 37.x.x.x
>
> Via commands issued in /etc/network/interfaces.
>
> The old route looks like this:
> 91.x.x.x dev br0 scope link src 37.x.x.x
>
> The route created by the network configuration looks like this:
> 91.x.x.x dev br0 proto static
>
> Although I'm not sure this represents a meaningful difference.
>
> On Wed, 12 Jul 2023 at 10:29, Andrei Borzenkov <arvidjaar@xxxxxxxxx> wrote:
>>
>> On Wed, Jul 12, 2023 at 10:44 AM LunarLambda <lunarlambda@xxxxxxxxx> wrote:
>> >
>> > Hello,
>> >
>> > I was recently tasked with moving existing network configuration for a machine and some nspawn containers from ifupdown to networkd.
>> >
>> > The situation looks as follows:
>> >
>> > A single VPS has 3 IPs. One 37.x.x.x/22, and two 91.x.x.x/32.
>> > The 37-ip is to be routed to the main server, whereas the two 91-ips should be routed directly to nspawn containers running on the server. The server uses systemd 247 and the container uses systemd 252, both Debian.
>> >
>> > I created a bridge netdev like so:
>> >
>> > [NetDev]
>> > Name=br0
>> > Kind=bridge
>> > # Matches physical network card
>> > MACAddress=AA:BB:CC:DD:EE:FF
>> >
>> > Bound the physical ethernet to it like so:
>> >
>> > [Match]
>> > Name=ens3
>> >
>> > [Network]
>> > Bridge=br0
>> >
>> > And set up the main IP for the bridge like so:
>> >
>> > [Match]
>> > Name=br0
>> >
>> > [Network]
>> > DNS=...
>> > DNS=...
>> > Address=37.x.x.x/22
>> > Gateway=37.x.x.1
>> >
>> > The nspawn containers are added to the bridge via
>> >
>> > [Network]
>> > Bridge=br0
>> >
>> > Up until this point everything works. However, configuring networking between the host and containers proved quite difficult and I'm unsure whether I'm doing something wrong or networkd is.
>> >
>> > What I tried was the following, inside the container:
>> >
>> > [Match]
>> > Virtualization=container
>> > Name=host0
>> >
>> > [Address]
>> > Address=91.x.x.x/32
>> >
>> > [Route]
>> > Gateway=37.x.x.x
>> > GatewayOnLink=true
>> >
>> > However, this did not create any usable routes to the host, nor did it throw any errors in the journal. Pinging the host does not work.
>> >
>> > Manually creating the routes with ip route did work:
>> >
>> > ip r add 37.x.x.x dev host0 onlink
>> > ip r add default dev host0 via 37.x.x.x
>> >
>> > I tried a variety of different combinations of options in the .network file, Scope, Type, etc...
>> >
>> > The only thing that successfully created any routes was the following:
>> >
>> > [Match]
>> > Virtualization=container
>> > Name=host0
>> >
>> > [Address]
>> > Address=91.x.x.x/32
>> > Peer=37.x.x.x/32
>> >
>> > [Network]
>> > Gateway=37.x.x.x
>> >
>> > This strikes me as odd because nowhere in the documentation, nor in any online searching could I find this described as necessary (beyond the manpage mentioning that Peer= exists)
>> >
>>
>> How is your Linux container supposed to know that to reach host
>> 37.x.x.x it needs to send a packet via the interface with address
>> 91.x.x.x? That is not how Linux routing normally works. You must have
>> a routing entry that tells the kernel how to forward the packet, and
>> assigning address 91.x.x.x to your interface does not magically create
>> any route entry to the network 37.x.x.x. Adding a peer address is one
>> possibility which does it. Another possibility is creating the
>> necessary routes manually like you did.
>>
>> > On the host side, I thought the bridge device, acting on Layer 2, would automatically figure out routes to the containers (via ARP),
>>
>> A bridge (physical or virtual) has nothing to do with routing; it
>> uses only MAC addresses. ARP is used by the kernel to find out the L2
>> address for a destination L3 address which is on the broadcast
>> network. It happens well after the routing decision has already been
>> made. So the kernel needs to know that network 37.x.x.x is directly
>> reachable on the broadcast segment to which the interface is connected
>> before the kernel even attempts ARP. That is exactly what your "ip r
>> add 37.x.x.x dev host0 onlink" does. An alternative is specifying a
>> peer address, which implicitly creates a similar routing entry (and
>> the peer can be the whole network).
>>
>> > or that nspawn and networkd would interact in some way to add routes.
>> > However, this didn't seem to happen, so I also had to add the following to the bridge's .network file:
>> >
>> > [Route]
>> > Source=37.x.x.x
>> > Destination=91.x.x.A
>> >
>> > [Route]
>> > Source=37.x.x.x
>> > Destination=91.x.x.B
>> >
>>
>> Same as above. Host must know how to forward packets to the addresses
>> 91.x.x.x and without routing entries nothing will tell the host how to
>> do it. Routing is bidirectional; a container knowing how to forward
>> traffic to the host does not automatically imply that the host knows
>> how to forward traffic to the container.
>>
>> > With all of this, everything works fine now. However, since the routes, both host-to-container and container-to-host, differ somewhat from the old (also working) setup,
>>
>> Your working setup must have created the same routing entries because
>> otherwise it would not work. Care to show your old configuration?
>>
>> > and some of the steps necessary I could not find described anywhere, I am left wondering if I fundamentally misunderstand something about how Linux networking works, or if perhaps networkd is behaving oddly because of the IP addresses being considered in different networks.
>>
>> You misunderstand how IP networking works. Nothing in your description
>> is Linux specific.
>>
>> >
>> > I would love to find a conclusive answer to this, especially because I want to make sure I understood the fundamental concepts involved correctly.
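
To illustrate the point about the routing decision coming before ARP, here
is a minimal Python sketch of a longest-prefix route lookup. It is a toy
model, not how the kernel is implemented: the lookup() and net() helpers
and the dictionary layout are made up for illustration, and the addresses
are the ones from the dummy0 test above plus a documentation address
(198.51.100.7) standing in for "somewhere on the internet".

```python
import ipaddress


def net(cidr):
    """Parse a CIDR string into a network object (illustrative helper)."""
    return ipaddress.ip_network(cidr)


def lookup(routes, dst):
    """Return the most specific route covering dst, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in r["dst"]]
    # Longest-prefix match: the route with the largest prefix length wins.
    return max(matches, key=lambda r: r["dst"].prefixlen, default=None)


# Toy model: an interface that only has 92.0.0.1/32 assigned. A /32 covers
# no neighbours, so there is no forwarding route at all yet.
routes = []
no_route = lookup(routes, "37.0.0.2")

# Rough equivalent of "ip r add 37.0.0.2 dev dummy0": the gateway is
# declared directly reachable on the link (no next hop).
routes.append({"dst": net("37.0.0.2/32"), "dev": "dummy0", "via": None})

# Rough equivalent of "ip r add default via 37.0.0.2 dev dummy0".
routes.append({"dst": net("0.0.0.0/0"), "dev": "dummy0", "via": "37.0.0.2"})

to_gateway = lookup(routes, "37.0.0.2")
to_internet = lookup(routes, "198.51.100.7")

print("before any route:", no_route)
print("to gateway:", to_gateway)
print("to internet:", to_internet)
```

Only once a lookup returns an entry without a next hop does it make sense
to ask "which MAC address is that?", which is the step ARP answers. A peer
address or an on-link route is what supplies that entry for the gateway.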