On Wed, Aug 8, 2012 at 3:54 PM, Corey Bryant <coreyb@xxxxxxxxxxxxxxxxxx> wrote:
>
>
> On 08/08/2012 09:04 AM, Stefan Hajnoczi wrote:
>>
>> On Tue, Aug 7, 2012 at 4:58 PM, Corey Bryant <coreyb@xxxxxxxxxxxxxxxxxx>
>> wrote:
>>>
>>> libvirt's sVirt security driver provides SELinux MAC isolation for
>>> Qemu guest processes and their corresponding image files. In other
>>> words, sVirt uses SELinux to prevent a QEMU process from opening
>>> files that do not belong to it.
>>>
>>> sVirt provides this support by labeling guests and resources with
>>> security labels that are stored in file system extended attributes.
>>> Some file systems, such as NFS, do not support the extended
>>> attribute security namespace, and therefore cannot support sVirt
>>> isolation.
>>>
>>> A solution to this problem is to provide fd passing support, where
>>> libvirt opens files and passes file descriptors to QEMU. This,
>>> along with SELinux policy to prevent QEMU from opening files, can
>>> provide image file isolation for NFS files stored on the same NFS
>>> mount.
>>>
>>> This patch series adds the add-fd, remove-fd, and query-fdsets
>>> QMP monitor commands, which allow file descriptors to be passed
>>> via SCM_RIGHTS, and assigned to specified fd sets. This allows
>>> fd sets to be created per file with fds having, for example,
>>> different access rights. When QEMU needs to reopen a file with
>>> different access rights, it can search for a matching fd in the
>>> fd set. Fd sets also allow for easy tracking of fds per file,
>>> helping to prevent fd leaks.
>>>
>>> Support is also added to the block layer to allow QEMU to dup an
>>> fd from an fdset when the filename is of the /dev/fdset/nnn format,
>>> where nnn is the fd set ID.
>>>
>>> No new SELinux policy is required to prevent open of NFS files
>>> (files with type nfs_t). The virt_use_nfs boolean type simply
>>> needs to be set to false, and open will be prevented (and dup will
>>> be allowed).
>>> For example:
>>>
>>> # setsebool virt_use_nfs 0
>>> # getsebool virt_use_nfs
>>> virt_use_nfs --> off
>>>
>>> Corey Bryant (6):
>>>   qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg
>>>   qapi: Introduce add-fd, remove-fd, query-fdsets
>>>   monitor: Clean up fd sets on monitor disconnect
>>>   block: Convert open calls to qemu_open
>>>   block: Convert close calls to qemu_close
>>>   block: Enable qemu_open/close to work with fd sets
>>>
>>>  block/raw-posix.c |   42 ++++-----
>>>  block/raw-win32.c |    6 +-
>>>  block/vdi.c       |    5 +-
>>>  block/vmdk.c      |   25 +++--
>>>  block/vpc.c       |    4 +-
>>>  block/vvfat.c     |   16 ++--
>>>  cutils.c          |    5 +
>>>  monitor.c         |  273 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  monitor.h         |    5 +
>>>  osdep.c           |  117 +++++++++++++++++++++++
>>>  qapi-schema.json  |  110 +++++++++++++++++++++
>>>  qemu-char.c       |   12 ++-
>>>  qemu-common.h     |    2 +
>>>  qemu-tool.c       |   20 ++++
>>>  qerror.c          |    4 +
>>>  qerror.h          |    3 +
>>>  qmp-commands.hx   |  131 +++++++++++++++++++++++++
>>>  savevm.c          |    4 +-
>>>  18 files changed, 730 insertions(+), 54 deletions(-)
>>
>>
>> Are there tests for this feature? Do you have test scripts used
>> during development?
>
>
> Yes I have some C code that I've been using for testing. I can clean it up
> and provide it if you'd like.

That would be very useful. tests/ has test cases. For the block layer,
tests/qemu-iotests/ is especially relevant; that's where a lot of the
test cases go. If you look at test case 030 you'll see how a Python
script interacts with QMP to test image streaming. Unfortunately, I
think Python doesn't natively support SCM_RIGHTS. Still, a test script
would be very useful so that it can serve as a regression test in the
future.

>>
>> Here's what I've gathered:
>>
>> Applications use add-fd to add file descriptors to fd sets. An fd set
>> contains one or more file descriptors, each with different access
>> modes (O_RDONLY, O_RDWR, O_WRONLY). File descriptors can be retrieved
>> from the fd set and are matched by their access modes.
>> This allows
>> QEMU to reopen files with different access modes.
>>
>> File descriptors stay in their fd set until explicitly removed by the
>> remove-fd command or when all monitor clients have disconnected. This
>> ensures that file descriptors are not leaked after a monitor client
>> crashes. Automatic removal on monitor close is postponed until all
>> duped fds have been fd - this means QEMU can still reopen an in-use fd
>
>
> I assume you mean "... until all duped fds have been *closed* - ..."

Yes, my typo :)

>> after a client disconnects.
>>
>> Does this sound right?
>
>
> Yes, exactly.
>
> I should point out there is an issue that needs to be cleaned up in the
> future. There are short windows of time where refcount can get to zero
> while an image file is in use. This is because the file is being reopened.
> For example, I've noticed this occurs when format= is not specified on the
> device_add command and the file is probed, and when mounting/unmounting a
> file system. Hopefully this can be treated as a follow-up issue.

The block layer doesn't treat this as a "reopen" today. Supriya
Kannery has a patch series for bdrv_reopen() which would also need to
be integrated with fd sets to ensure the refcount doesn't hit 0 and
cause a cleanup.
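
On the SCM_RIGHTS point above: Python 3.3 and later do expose sendmsg()
and recvmsg() on sockets, so a test script could pass descriptors
itself. Below is a rough sketch of the idea, not code from the patch
series. It uses a socketpair in place of a real QMP monitor socket, and
the helper names (send_fd, recv_fd, pick_fd, demo) are made up for
illustration. pick_fd only imitates the access-mode matching described
in the summary above; it is not how QEMU implements it.

```python
import array
import fcntl
import os
import socket
import tempfile

# Access-mode bits, spelled out instead of relying on an O_ACCMODE constant.
ACCMODE = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


def send_fd(sock, fd, payload=b"x"):
    """Send one fd over a Unix socket as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock, bufsize=16):
    """Receive a payload plus one fd; returns (payload, fd)."""
    fds = array.array("i")
    payload, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial int before decoding.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
            return payload, fds[0]
    raise RuntimeError("no file descriptor received")


def pick_fd(fdset, flags):
    """Hypothetical lookup: dup the first fd whose access mode matches."""
    wanted = flags & ACCMODE
    for fd in fdset:
        if fcntl.fcntl(fd, fcntl.F_GETFL) & ACCMODE == wanted:
            return os.dup(fd)
    return -1


def demo():
    """Pass a read-only and a read-write fd for one file, then match by mode."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.NamedTemporaryFile() as img:
        img.write(b"image data")
        img.flush()
        fdset = []
        for flags in (os.O_RDONLY, os.O_RDWR):
            fd = os.open(img.name, flags)
            send_fd(left, fd)
            os.close(fd)  # the sender's copy is no longer needed
            fdset.append(recv_fd(right)[1])
        rw = pick_fd(fdset, os.O_RDWR)
        ro = pick_fd(fdset, os.O_RDONLY)
        wo = pick_fd(fdset, os.O_WRONLY)  # no write-only fd was added
        data = os.read(ro, 10)
        rw_mode = fcntl.fcntl(rw, fcntl.F_GETFL) & ACCMODE
        for fd in fdset + [rw, ro]:
            os.close(fd)
    left.close()
    right.close()
    return data, rw_mode, wo


if __name__ == "__main__":
    print(demo())
```

Each descriptor goes in its own message, which mirrors one add-fd call
per fd. A real test would send the QMP command text as the payload on
the monitor socket instead of the placeholder byte used here.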