On Tue, 2007-01-16 at 22:28 +0000, Daniel P. Berrange wrote:
> On Mon, Jan 15, 2007 at 08:06:18PM +0000, Mark McLoughlin wrote:
>
> Since we've disappeared down a rat-hole with the other part of the thread,
> here's an attempt to get back on-topic :-)

Indeed :-)

> Since the user is privileged, another way to do without VDE is to mirror
> the Xen case almost exactly, creating one tap device per guest, instead
> of Xen's netback vif devices:

Sure. There is the argument that always using VDE is nicer because it's
consistent with the non-privileged and remotely connected network
versions. As you say, though, this way is consistent with the Xen
version.

> > 3. An unprivileged user does exactly the same thing as (2).
> >
> >   +-----------+                               +-----------+
> >   |   Guest   |          +----+----+          |   Guest   |
> >   |     A     |          |userspace|          |     B     |
> >   |   +---+   |          | network |          |   +---+   |
> >   |   |NIC|   |          |  stack  |          |   |NIC|   |
> >   +---+-+-+---+          +----+----+          +---+-+-+---+
> >         ^       +-------+     |     +-------+       ^
> >         |       |       | +---+---+ |       |       |
> >         +------>+ VLAN0 +-+  VDE  +-+ VLAN0 +<------+
> >                 |       | +-------+ |       |
> >                 +-------+           +-------+
> >
> >   Notes:
> >
> >     * Similar to (2) except there can be no TAP device or
> >       bridge
> >     * The userspace network stack is implemented using
> >       slirpvde to provide a DHCP server and DNS proxy to the
> >       network, but also effectively a SNAT and DNAT router.
> >     * slirpvde implements ethernet, ip, tcp, udp, icmp, dhcp,
> >       tftp (etc.) in userspace. Completely crazy, but since
> >       the kernel apparently has no secure way to allow
> >       unprivileged users to leverage the kernel's network
> >       stack for this, then it must be done in userspace.
>
> Is it practical to just have some kind of privileged proxy that would
> merely create & configure the tap devices on behalf of the unprivileged
> guests ? If we just create tap devices for any unprivileged guest, but
> kept them disconnected from any real network device, would that still be
> a big hole ?

Okay, to avoid a userspace network stack, you need a way to securely
allow guests running as unprivileged users to use the kernel's network
stack. That implies:

  1) The packets/frames have to arrive on a network interface created
     by the user (e.g. a TAP or SLIP iface)

  2) It should not be possible to spoof as another host or adversely
     affect the host's connectivity, or that of any other machine on
     the same network as the host

  3) slirp prevents spoofing by effectively translating the source
     address of any packet which leaves the virtual network, just like
     a router using SNAT

  4) We can do the same thing by enabling IP forwarding and having all
     packets forwarded by the host go through SNAT

  5) The problem with that is what to do about packets which are not
     being forwarded by the host, but are destined for the host
     itself? SNAT in PREROUTING might do it, but it seems that's not
     allowed.

  6) We also have to worry about whether people could e.g. screw up
     the host's ARP cache

  7) We also have to worry about a DoS whereby someone creates lots of
     network interfaces

And note, this isn't just about worrying about nasty guests. You have
to worry about what nasty users on the host could do with a setuid
helper like this.

It's certainly got to be "possible" ... but I don't yet feel I know
what all the bases are that need to be covered, never mind how we'd
cover them.

> Or can we leverage QEMU's builtin SLIRP or other non-TAP networking modes
> to construct something reasonable in userspace, without using VDE?

The general problem with any SLIRP derivative or similar is that it's
another network stack implementation. That makes me nervous for
security, performance, stability and portability reasons. Case in
point: as I found out, SLIRP currently has buffer overflow
vulnerabilities and isn't 64-bit clean.

> > Virtual Networks will be implemented in libvirt.
> > First, there will be an XML description of Virtual Networks e.g.:
> >
> >   <network id="0">
> >     <name>Foo</name>
> >     <uuid>596a5d2171f48fb2e068e2386a5c413e</uuid>
> >     <listen address="172.31.0.5" port="1234" />
> >     <connections>
> >       <connection address="172.31.0.6" port="4321" />
> >     </connections>
> >     <dhcp enabled="true">
> >       <ip address="10.0.0.1"
> >           netmask="255.255.255.0"
> >           start="10.0.0.128"
> >           end="10.0.0.254" />
> >     </dhcp>
> >     <forwarding enabled="true">
> >       <incoming default="deny">
> >         <allow port="123" domain="foobar" destport="321" />
> >       </incoming>
> >       <outgoing default="allow">
> >         <deny port="25" />
> >       </outgoing>
> >     </forwarding>
> >   </network>
>
> Got to also think how we connect guest domains to the virtual network.

Right, further on in the mail I said:

  * Where is the connection between domains and networks in either
    the API or the XML format? How is a domain associated with a
    network? You put a bridge name in the <network> definition and
    use that in the domain's <interface> definition? Or you put the
    network name in the interface definition and have libvirt look
    up the bridge name when creating the guest?

> Currently we just have something really simple like
>
>   <interface type="bridge">
>     <source bridge='xenbr0'/>
>     <mac address='00:11:22:33:44:55'/>
>   </interface>
>
> I guess we'd probably want to refer to the UUID of the network to map
> it into the guest.

Well, the UUID isn't much good if you can't map it. So, it would
probably be the name and libvirt URI, right?

> Oh, do we need to define a 'network 0' to be the physical network of the
> host machine - what if there are multiple host NICs - any conventions we
> need to let us distinguish ? Maybe it's best to just refer to the host
> network by using IP addresses - so we can deal better with the case where
> a machine switches from eth0 -> eth1 (wired to wireless) but keeps the
> same IP address, or some such.

Well, I think there should be a default virtual network defined somehow.
You shouldn't need to create one unless you want a second one.

But remember that under the model I'm suggesting, guests connect
*either* to a virtual network or a physical network via a "shared
physical interface". The shared physical interface just winds up being
a bridge you enslave the guest's interface to, so the easiest answer
for that is that we stick with the way it is right now for Xen and have
QEMU create a TAP device and enslave that to the bridge in this mode.

Dunno, it does need more thought/discussion ... I find the current
<interface> stuff quite strange now - e.g. "bridge" vs. "ethernet"
types and the bridge name is in <source> ?

Cheers,
Mark.