Adding Paul Moore since he is the labelled networking expert.

On 08/20/2014 11:17 AM, Daniel P. Berrange wrote:
> As everyone knows sVirt is our nice solution for isolating guest resources
> from other (malicious) guests through SELinux labelling of the appropriate
> files / device nodes. This has been pretty effective since we introduced
> it to libvirt.
>
> In the last year or two, particularly in the cloud arena, there has been
> a big shift towards use of network based storage. Initially we were relying
> on kernel drivers / FUSE layers that exposed this network storage as devices
> or nodes in the host filesystem, so sVirt still stood a chance of being
> useful if the devices / FUSE layer supported labelling.
>
> Now though QEMU has native support for talking to gluster, ceph/rbd,
> iscsi and even nfs servers. This support is increasingly used in preference
> to using the kernel drivers / FUSE layers since it provides a simpler and
> thus (in theory) better performing I/O path for the network storage and
> does not require any privileged setup tasks on the host ahead of time.
>
> The problem is that I believe this is blowing a decent sized hole in our
> sVirt isolation story.
>
> eg when we launch QEMU with an argument like this:
>
>   -drive 'file=rbd:pool/image:auth_supported=none:\
>     mon_host=mon1.example.org\:6321\;mon2.example.org\:6322\;\
>     mon3.example.org\:6322,if=virtio,format=raw'
>
> We are trusting QEMU to only ever access the disk volume 'pool/image'.
> There are, in all likelihood, many 100's or 1000's of disk images on the
> server it is connecting to, and nothing is stopping QEMU from accessing
> any of them AFAICT.
>
> There is no currently implemented mechanism by which the sVirt label
> that QEMU runs under is made available to the remote RBD server to use
> for enforcement, nor any way in which libvirt could tell the RBD server
> which label was applied to which disk. The same seems to apply to
> Gluster, iSCSI and NFS too when they are accessed directly from a
> network client inside the QEMU process.
>
> As it stands, the only approach I see for isolating each virtual machine's
> disk(s) from other virtual machines is to make use of user authentication
> with these services. eg each virtual machine would need to have its own
> dedicated user account on the RBD/Gluster/iSCSI/NFS server, and the disk
> volumes for the VM would have to be made accessible solely to that user
> account. Assuming such a user account / disk mapping exists in the servers
> today, that can be made to work, but it is an incredibly awful solution
> to deal with when VMs are being dynamically created & deleted very
> frequently.
>
> Today apps like OpenStack just have a single RBD username and password
> for everything they do. Any virtual machines running with RBD storage
> on OpenStack thus have no sVirt protection for their disk images AFAICT.
> To protect images OpenStack would have to dynamically create & delete
> new user accounts on the RBD server & set up disk access for them. I
> don't see that kind of approach being viable.
>
> IIUC, there is some mechanism at the IP stack level where the kernel
> can take the SELinux label of the process that establishes the network
> connection and pass it across to the server. If there was a way in the
> RBD API for libvirt to label the volumes, then potentially we could
> have a system where the RBD server did sVirt enforcement, based on the
> instructions from libvirt & the label of the client process.
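
The IP stack mechanism referred to above is labelled networking (labelled
IPsec or NetLabel/CIPSO), which is why Paul is the right person to ask.
Where that is configured between client and server, the server can read
the SELinux context of the connecting process straight from the kernel,
e.g. via getpeercon() from libselinux. Purely as a sketch of what an
enforcing storage daemon could do with it (the volume_label argument and
the exact string comparison are invented for illustration; a real check
would presumably consult policy / MCS categories instead):

  /* Sketch only: requires a transport that actually carries labels
   * (labelled IPsec or NetLabel/CIPSO); build with -lselinux. */
  #include <stdio.h>
  #include <string.h>
  #include <selinux/selinux.h>

  /* Return 1 if the client on connfd may use the volume, 0 otherwise. */
  int may_access_volume(int connfd, const char *volume_label)
  {
      char *ctx = NULL;

      /* e.g. "system_u:system_r:svirt_t:s0:c57,c686" for an sVirt guest */
      if (getpeercon(connfd, &ctx) < 0) {
          perror("getpeercon");
          return 0;            /* no label available => deny */
      }

      int ok = strcmp(ctx, volume_label) == 0;
      freecon(ctx);
      return ok;
  }

Of course that only helps if the labels actually cross the wire, which
AFAIK is not the case for a plain TCP connection to a Ceph or Gluster
server today.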

> Thoughts on what to do about this? Network based storage, where the
> network client is inside each QEMU process, is here to stay so I don't
> think we can ignore the problem long term.
>
> Regards,
> Daniel
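
To make the exposure concrete: with the single shared credential model
described above, a few librados/librbd calls are enough to open any image
in the pool, and QEMU's rbd driver links against these same libraries.
This is only a sketch; the user, pool and image names are invented and
error handling is omitted:

  /* Nothing on the server side stops a client holding the shared
   * credential from opening an arbitrary image in the pool.
   * Build with: cc demo.c -lrados -lrbd */
  #include <rados/librados.h>
  #include <rbd/librbd.h>

  int main(void)
  {
      rados_t cluster;
      rados_ioctx_t ioctx;
      rbd_image_t image;

      rados_create(&cluster, "openstack");   /* the one shared cephx user */
      rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
      rados_connect(cluster);

      rados_ioctx_create(cluster, "volumes", &ioctx);
      /* not our disk, but the shared credential allows it anyway */
      rbd_open(ioctx, "some-other-vms-image", &image, NULL);

      rbd_close(image);
      rados_ioctx_destroy(ioctx);
      rados_shutdown(cluster);
      return 0;
  }

So without either per-VM credentials or some form of label-aware
enforcement on the server, as far as I can tell sVirt stops at the socket.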