On Mon, Aug 11, 2008 at 12:17:48PM +1000, James Morris wrote:
> 4. Design Considerations
>
> 4.1 Consensus in preliminary discussion appears to be that adding
>     MAC to libvirt will be the most effective approach. Support
>     may then be extended to virsh, virt-manager, oVirt etc.

I can see a couple of immediate items to address in the libvirt space

 - Need to decide how to ensure the VM is run with the correct
   security label instead of the default virt_t.

   Cannot assume that all VMs have disks configured. Some VMs may be
   PXE boot, and use an NFS/iSCSI root filesystem - this is not
   visible to the host. Implication is that we can't rely on
   labelling of disk files to infer the VM's security context.

   This suggests the domain XML format needs to allow for a security
   context to be specified at the time the VM is defined/created in
   libvirt. libvirt would have to take steps to make sure the VM is
   started with this defined context.

   An approach of including the context in the XML would also allow
   easy extension to the Xen XSM framework in future, where you
   specify a context at the time of VM creation, which is passed to
   the hypervisor.

 - The storage XML format can already report what label a storage
   volume currently has. In addition we need to be able to set the
   label. A few complications...

     - We may need to set it in several places - ie a VM may be
       assigned a disk based on a stable path such as

         /dev/disk/by-uuid/4cb23887-0d02-4e4c-bc95-7599c85afc1a

       which is a symlink to the real (unstable) device name

         /dev/sda1

       Clearly we need to set the label on the real device, but may
       also need to change the symlink too ?

     - We can't add the new label to the SELinux policy directly,
       because the label needs to be on the unstable device name
       /dev/sdaXXX which can change across host OS reboots.

       Do we instead add the info to the udev rules, so when /dev is
       populated at boot time by udev the device nodes get the
       desired initial labelling ?

       Or do we manually chcon() the device at the time we boot the
       VM ?

     - Some storage types don't allow per-file labelling - eg NFS

       In those scenarios the storage pool is assigned a label and
       all volumes inherit it. So, if two VMs are using NFS files
       and need different labelling, they need to use different
       directories on the NFS server, so that we can have separate
       mount points with appropriate labelling for each.

> 4.2 Initially, sVirt should "just work" as a means to isolate VMs,
>     with minimal administrative interaction. e.g. an option is
>     added to virt-manager which allows a VM to be designated as
>     "isolated", and from then on, it is automatically run in a
>     separate security context, with policy etc. being generated
>     and managed by libvirt.
>
> 4.3 We need to consider very carefully exactly how VMs will be
>     launched and controlled e.g. overall security ultimately must
>     be enforced by the host kernel, even though VM launch will be
>     initially controlled by host userspace.
>
> 4.4 We need to consider overall management of labeling both
>     locally and in distributed environments (e.g. oVirt), as well
>     as situations where VMs may be migrated between systems,
>     backed up etc.

We need to define who/what is responsible for ensuring that all hosts
in the cluster have the same policy loaded. Typically libvirt only
aims to provide the mechanism, and not constrain what you do with it.

So perhaps libvirt needs to merely be able to report on what policy
version is loaded as part of host capabilities information. oVirt
(or FreeIPA?) would be responsible for using this info, and also
ensuring that all hosts have same policy if desired/required.

>     One possible approach may be to represent the security label
>     as the UUID of the guest and then translate that to locally
>     meaningful values as needed.

This implies there needs to be some lookup table of UUID -> security
label mappings on every host in the cluster.
This needs to be updated whenever a new VM is created, which is a
fairly significant data sync task someone/thing needs to take care
of. Would be doable for oVirt or FreeIPA, since they have a
network-wide view. virt-manager though has an individual,
host-centric view of things - it doesn't consider the broader
picture.

> 4.5 MAC controls/policy will need to be considered for any control
>     planes (e.g. /dev/kvm).

I should probably point out that there are in fact two ways in which
KVM/QEMU can be used on a host

 - The 'system' instance. There is one of these per host, and it
   currently runs as a privileged user (ie root)

 - The 'session' instance. There is one of these per user, per host
   and it runs as the unprivileged user.

The session instances can only utilize KVM acceleration if the host
admin has given them appropriate group/ACL membership to access
/dev/kvm. Likewise they can only access physical devices if they
have the necessary group/ACL membership for the device. Network
access is SLIRP based unless the admin has pre-created TUN devices &
given them access.

I imagine that for this work we'll primarily target the 'system'
instance and anything that happens to work for the 'session'
instances can just be considered a free bonus.

> 4.10 {lib}semanage needs performance optimization work to reduce
>      impact on the virt toolchain.

Specifically in libvirt we need to avoid a dependency on python.
For oVirt we have a requirement that the operating system for a
'managed node' (ie the host running VMs) can be built into a
Live CD / PXE bootable image that is < 64 MB in size. So any new
dependencies from libvirt are very sensitive in terms of on-disk
footprint.

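Coming back to the stable-path question in the storage item above,
here is a rough sketch of the "manually chcon() at VM boot" option.
It is Python purely for illustration, not something libvirt itself
would ship given the footprint concern. paths_to_label() and
apply_label() are hypothetical names, not existing libvirt functions,
and it assumes the label is applied by writing the security.selinux
extended attribute, which needs an SELinux-enabled host and
sufficient privilege.

```python
import os
from typing import List

# Extended attribute in which the kernel stores a file's SELinux context.
SELINUX_XATTR = "security.selinux"


def paths_to_label(disk_path: str) -> List[str]:
    """Return the nodes that may need a label for one configured disk.

    The resolved (real) device always comes first. The configured path
    is added only if it is itself a symlink, e.g. /dev/disk/by-uuid/...
    """
    real = os.path.realpath(disk_path)
    paths = [real]
    if os.path.islink(disk_path):
        paths.append(os.path.abspath(disk_path))
    return paths


def apply_label(disk_path: str, context: str) -> None:
    """Assumed approach: set `context` on the real device and the symlink."""
    value = context.encode("ascii")
    for path in paths_to_label(disk_path):
        # follow_symlinks=False labels the link itself, not its target.
        os.setxattr(path, SELINUX_XATTR, value, follow_symlinks=False)
```

Because the real device name is resolved at the moment the VM is
started, this sidesteps the problem of /dev/sdaXXX changing across
host reboots. Whether the symlink needs labelling at all is still
the open question.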
Daniel
--
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|