On Mon, Jun 08, 2020 at 11:05:00PM -0400, Laine Stump wrote:
> On 6/8/20 10:51 AM, Daniel P. Berrangé wrote:
> > The virtual network has never supported NAT with IPv6 since this feature
> > didn't exist at the time. NAT has been available since RHEL-7 vintage
> > though, and it is desirable to be able to use it.
> >
> > This series enables it with
> >
> >   <forward mode="nat">
> >     <nat ipv6="yes"/>
> >   </forward>
>
> I've had this lurking on my "this is something I should do" list for a long
> time, but couldn't decide on the best name in XML (and also figured that the
> problem with accept_ra needed to be fixed first), so it never got to the
> top. So I'm glad to see you've done it, disappointed in myself that I never
> did it :-/
>
> I like your XML knob naming better than what I'd considered. I had thought
> of having <forward mode='supernat'> (or some other more reasonable extra
> mode), but your proposal is more orthogonal and matches with the existing
> ipv6='yes' at the toplevel of <network> (which is used to enable ipv6
> traffic between guests on the bridge even when there are no IPv6 addresses
> configured for the network.)

I considered mode="nat6" as an alternative, but it would have meant
updating many switch() statements, and it is somewhat misleading as a
name.

> >   </network>
> >
> > Conceptually this means
> >
> >  - Try to gimme a subnet with IPv4 and DHCP
> >  - Try to gimme a subnet with IPv6 and RAs
> >
> > Now when we start the virtual network
> >
> >  - If IPv4 is not enabled on host, don't assign addr
>
> What will we use to check for this? Not just "no IP addresses configured", I
> guess, since it may be the case that libvirt has just happened to come up
> before NM or whoever has started any networks. (or maybe someone wants to
> use IPv6 on a libvirt virtual network, but have no IPv6 connectivity beyond
> the host).

IIUC, we can simply check whether it is possible to create a socket
with AF_INET or AF_INET6.
If the kernel supports it, then this should succeed, even if
NetworkManager isn't running yet.

> >  - Else
> >     - Iterate N=1..254 to find a free range for IPv4
> >     - Use 192.168.N.0/24 for subnet
> >     - Use 192.168.N.1 for host IP
> >     - Use 192.168.N.2 -> 192.168.N.254 for guest DHCP
> >
> >  - If IPv6 is not enabled on host, don't assign addr
> >  - Else
> >     - Generate NNNN:NNNN as 4 random bytes
> >     - Use fd00:add:f00d:NNNN:NNNN::0/64 for IPv6 subnet
> >     - Use fd00:add:f00d:NNNN:NNNN::1 for host IP
> >     - Use route advertisement for IPv6 zero-conf
> >
> > With NNNN:NNNN, even with 1000 guests running, we have just a 0.02%
> > chance of clashing with a guest for IPv6.
> >
> > The "live" XML would always reflect the currently assigned addresses
> >
> > Proactively monitor the address allocations of the host. If we see
> > a conflicting address appear, take down the dnsmasq instance, generate
> > a new subnet, bring dnsmasq back online.
>
> Hmm. How would you see this monitoring happening? We couldn't do it with an
> external script like I had done for simple "shut down on conflict" without
> adding extra functionality to libvirt's network driver. We *could* go back
> to the idea of monitoring netlink change messages ourselves within libvirtd
> and doing it all internally ourselves. Or maybe the NM script I proposed
> could go beyond simply destroying conflicting networks, and also restart any
> network that had autoaddr='yes'; to make this fully functional we would need
> to finally put in the proper stuff so that tap devices (and the underlying
> emulated NICs) would be set offline when their connected network was
> destroyed, and then reconnected/set online when the network was re-started.
> Getting the networks to behave this way would be useful in general anyway,
> even without thinking about the conflicting-networks problem.
> The one downside of externally controlling renumbering-on-conflict using an
> external script is that it would only work with NetworkManager...

Yeah, I'm trying to remember now why we went the NM hook route, rather
than listening for netlink events. I guess NM is much simpler to hook
into.

I'd honestly not thought about this too much though - just having an
automatically numbered network will already be a huge step forward
compared to current day.

In particular, if we instituted a rule that if we are NOT on a
hypervisor, we count from N=254 -> 0 when picking 192.168.N.0, and
count from N=0 -> 254 when we are on a hypervisor, then we'll trivially
avoid the host/guest clash in the simple case, even if the network is
not yet online.

Don't anyone dare mention nested virt with 3 levels of libvirt...

Seriously though, even without automatic teardown & restart, we'd be
way better off by simply not hardcoding 192.168.N.0 at RPM install time,
when the network env is not the same as the run time network env, e.g.
cloud images.

> > Ideally we would have to bring the guest network links offline and
> > then online again to force DHCP re-assignment immediately.
>
> Yeah, I think it really makes sense that when a libvirt network is
> destroyed, all the tap devices are set offline, and the emulated NICs are
> set offline as well; then when a libvirt network is started, we would go
> through all devices that are supposed to be connected to that network,
> reconnect the taps, set them online, and set the emulated NIC online. We
> currently do the reconnection part when libvirtd is restarted but can't do
> it immediately when a *network* is restarted because the network driver has
> no access to the list of active guests and their interfaces....
>
> Hmm, we do now maintain the list of ports for each network though, and it
> would be possible to expand that to keep the name of the tap device
> associated with the port in addition to the other info (e.g.
> whether or not the NIC has been set offline via an API call), *but* when a
> network is destroyed, all ports registered with that network are also
> destroyed, so just expanding the attributes for the ports isn't going to get
> us where we need. So, do we want to 1) change it to maintain active ports for
> a network when it is destroyed so that they can be easily reactivated when
> the network is restarted? Or do we want to 2) change the network driver to
> make calls to all registered hypervisor drivers during a net-start to look
> for all guest interfaces that think they are connected to the network? The
> former sounds much more efficient, but I don't know how "dirty" it seems to
> maintain state for something that has been "destroyed"...
>
> Or maybe we instead need to also add a new API for networks
> virNetworkReconnect(), which will use newly expanded info in the network
> ports list to reconnect all guest interfaces.

Responsibility for enslaving a TAP device into a bridge still lives
with the virt drivers, not the network driver. The virt drivers could
listen for lifecycle events from the network driver and auto-reconnect.
Alternatively the virt driver could listen for netlink events and see
the virbr0 being deleted, and created by the kernel.

> On a different sub-topic - it would be nice to provide some stability to the
> subnet used for an autoaddr='yes' network (think of the case where every
> time a host is booted, libvirt starts its default network when
> 192.168.122.0/24 is available, but then a short time later a host interface
> is always started on the same subnet - that would mean every time the host
> booted the exact same destabilizing dance would take place even though it
> would be pretty easy to predict the eventually-used subnet based on past
> experience).
>
> Although we historically have avoided automatic changes to libvirt config
> files by libvirtd itself as much as possible (the only cases I can think of
> are when we're modifying the config to take care of some compatibility
> problem after an upgrade), what do you think about having the autoaddr='yes'
> networks automatically update the config with the current subnet info?
> (maybe this would need to only be done if not starting from a live image or
> something, or maybe it should just always be done). This would then be used
> as the first guess the next time the network was started. That way we would
> avoid the need to delay starting libvirt networks until after host
> networking was fully up; the subnet might bounce around a bit that first
> time, but once a stable address was found during that first run, it would
> then be used from the get-go during all subsequent boots (until/unless
> something changed and it had to be changed yet again).

We could stash the previously chosen subnet in /var/cache/libvirt/network
or /var/lib/libvirt/network, no need to modify the inactive XML config.
This is like how dnsmasq "remembers" DHCP leases previously given for
guests.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
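
PS: to make the numbering scheme discussed above concrete, here is a rough
Python sketch of the three pieces: the "can we create a socket" check, the
scan over 192.168.N.0/24, and the random IPv6 prefix. It is only an
illustration, not libvirt code. All function names are made up, the list of
in-use subnets is assumed to come from somewhere else, and the `descending`
flag is a loose stand-in for the count-up versus count-down rule. The IPv6
prefix is written as /80 rather than the /64 used in the thread, purely
because five 16-bit groups occupy 80 bits and Python's ipaddress module
rejects a /64 with those bits set.

```python
import ipaddress
import secrets
import socket


def family_supported(family):
    """Return True if the kernel lets us create a socket of this family."""
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError:
        return False
    sock.close()
    return True


def pick_ipv4_subnet(in_use, descending=False):
    """Return the first 192.168.N.0/24 (N in 1..254) not overlapping in_use.

    in_use is an iterable of ipaddress.IPv4Network objects already present
    on the host; gathering that list is outside this sketch.
    """
    candidates = range(254, 0, -1) if descending else range(1, 255)
    for n in candidates:
        subnet = ipaddress.ip_network(f"192.168.{n}.0/24")
        if not any(subnet.overlaps(used) for used in in_use):
            return subnet
    return None  # every candidate range is taken


def ipv4_layout(subnet):
    """Return (host IP, first DHCP address, last DHCP address)."""
    base = subnet.network_address
    return base + 1, base + 2, base + 254


def pick_ipv6_subnet(rand=None):
    """Build fd00:add:f00d:NNNN:NNNN:: from 4 random bytes."""
    raw = secrets.token_bytes(4) if rand is None else rand
    hi, lo = raw[:2].hex(), raw[2:].hex()
    # Assumption: /80, since the five fixed groups already use 80 bits.
    return ipaddress.ip_network(f"fd00:add:f00d:{hi}:{lo}::/80")
```

A caller would gate each address family on family_supported(socket.AF_INET)
or family_supported(socket.AF_INET6) before picking a subnet, and could
pass the previously stashed subnet as a first guess before falling back to
the scan.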