On Tue, Jun 14, 2011 at 4:31 PM, Corey Bryant <bryntcor@xxxxxxxxxx> wrote: > sVirt provides SELinux MAC isolation for Qemu guest processes and their > corresponding resources (image files). sVirt provides this support > by labeling guests and resources with security labels that are stored > in file system extended attributes. Some file systems, such as NFS, do > not support the extended attribute security namespace, which is needed > for image file isolation when using the sVirt SELinux security driver > in libvirt. > > The proposed solution entails a combination of Qemu, libvirt, and > SELinux patches that work together to isolate multiple guests' images > when they're stored in the same NFS mount. This results in an > environment where sVirt isolation and NFS image file isolation can both > be provided. > > This patch contains the Qemu code to support this solution. I would > like to solicit input from the libvirt community prior to starting > the libvirt patch. > > Currently, Qemu opens an image file in addition to performing the > necessary read and write operations. The proposed solution will move > the open out of Qemu and into libvirt. Once libvirt opens an image > file for the guest, it will pass the file descriptor to Qemu via a > new fd: protocol. > > If the image file resides in an NFS mount, the following SELinux policy > changes will provide image isolation: > > Â- A new SELinux boolean is created (e.g. virt_read_write_nfs) to >  Âallow Qemu (svirt_t) to only have SELinux read and write >  Âpermissions on nfs_t files > > Â- Qemu (svirt_t) also gets SELinux use permissions on libvirt >  Â(virtd_t) file descriptors > > Following is a sample invocation of Qemu using the fd: protocol on > the command line: > >  Âqemu -drive file=fd:4,format=qcow2 > > The fd: protocol is also supported by the drive_add monitor command. > This requires that the specified file descriptor is passed to the > monitor alongside a prior getfd monitor command. > > There are some additional features provided by certain image types > where Qemu reopens the image file. All of these scenarios will be > unsupported for the fd: protocol, at least for this patch: > > Â- The -snapshot command line option > Â- The savevm monitor command > Â- The snapshot_blkdev monitor command > Â- Starting Qemu with a backing file There's also native CDROM device. Did you consider adding an explicit reopen method to block layer? > The thought is that this support can be added in the future, but is > not required for the initial fd: support. > > This patch was tested with the following formats: raw, cow, qcow, > qcow2, qed, and vmdk, using the fd: protocol from the command line > and the monitor. Tests were also run to verify existing file name > support and qemu-img were not regressed. Non-valid file descriptors, > fd: without format, snapshot and backing files were also tested. > > Signed-off-by: Corey Bryant <coreyb@xxxxxxxxxxxxxxxxxx> > --- > Âblock.c      |  16 ++++++++++ > Âblock.h      |  Â1 + > Âblock/cow.c    |  Â5 +++ > Âblock/qcow.c   Â|  Â5 +++ > Âblock/qcow2.c   |  Â5 +++ > Âblock/qed.c    |  Â4 ++ > Âblock/raw-posix.c |  81 +++++++++++++++++++++++++++++++++++++++++++++++------ > Âblock/vmdk.c   Â|  Â5 +++ > Âblock_int.h    |  Â1 + > Âblockdev.c    Â|  10 ++++++ > Âmonitor.c     |  Â5 +++ > Âmonitor.h     |  Â1 + > Âqemu-options.hx  |  Â8 +++-- > Âqemu-tool.c    |  Â5 +++ > Â14 files changed, 140 insertions(+), 12 deletions(-) > > diff --git a/block.c b/block.c > index 24a25d5..500db84 100644 > --- a/block.c > +++ b/block.c > @@ -536,6 +536,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, >     char tmp_filename[PATH_MAX]; >     char backing_filename[PATH_MAX]; > > +    Âif (bdrv_is_fd_protocol(bs)) { > +      Âreturn -ENOTSUP; > +    Â} > + >     /* if snapshot, we create a temporary backing file and open it >      Âinstead of opening 'filename' directly */ > > @@ -585,6 +589,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags, > >   /* Find the right image format driver */ >   if (!drv) { > +    Â/* format must be specified for fd: protocol */ > +    Âif (bdrv_is_fd_protocol(bs)) { > +      Âreturn -ENOTSUP; > +    Â} >     ret = find_image_format(filename, &drv); >   } > > @@ -1460,6 +1468,11 @@ int bdrv_enable_write_cache(BlockDriverState *bs) >   return bs->enable_write_cache; > Â} > > +int bdrv_is_fd_protocol(BlockDriverState *bs) > +{ > +  Âreturn bs->fd_protocol; > +} > + > Â/* XXX: no longer used */ > Âvoid bdrv_set_change_cb(BlockDriverState *bs, >             void (*change_cb)(void *opaque, int reason), > @@ -1964,6 +1977,9 @@ int bdrv_snapshot_create(BlockDriverState *bs, >   BlockDriver *drv = bs->drv; >   if (!drv) >     return -ENOMEDIUM; > +  Âif (bdrv_is_fd_protocol(bs)) { > +    Âreturn -ENOTSUP; > +  Â} >   if (drv->bdrv_snapshot_create) >     return drv->bdrv_snapshot_create(bs, sn_info); >   if (bs->file) > diff --git a/block.h b/block.h > index da7d39c..dd46d52 100644 > --- a/block.h > +++ b/block.h > @@ -182,6 +182,7 @@ int bdrv_is_removable(BlockDriverState *bs); > Âint bdrv_is_read_only(BlockDriverState *bs); > Âint bdrv_is_sg(BlockDriverState *bs); > Âint bdrv_enable_write_cache(BlockDriverState *bs); > +int bdrv_is_fd_protocol(BlockDriverState *bs); > Âint bdrv_is_inserted(BlockDriverState *bs); > Âint bdrv_media_changed(BlockDriverState *bs); > Âint bdrv_is_locked(BlockDriverState *bs); > diff --git a/block/cow.c b/block/cow.c > index 4cf543c..e17f8e7 100644 > --- a/block/cow.c > +++ b/block/cow.c > @@ -82,6 +82,11 @@ static int cow_open(BlockDriverState *bs, int flags) >   pstrcpy(bs->backing_file, sizeof(bs->backing_file), >       cow_header.backing_file); > > +  Âif (bs->backing_file[0] != '\0' && bdrv_is_fd_protocol(bs)) { > +    Â/* backing file currently not supported by fd: protocol */ > +    Âgoto fail; > +  Â} > + >   bitmap_size = ((bs->total_sectors + 7) >> 3) + sizeof(cow_header); >   s->cow_sectors_offset = (bitmap_size + 511) & ~511; >   return 0; > diff --git a/block/qcow.c b/block/qcow.c > index a26c886..1e46bdb 100644 > --- a/block/qcow.c > +++ b/block/qcow.c > @@ -157,6 +157,11 @@ static int qcow_open(BlockDriverState *bs, int flags) >     if (bdrv_pread(bs->file, header.backing_file_offset, bs->backing_file, len) != len) >       goto fail; >     bs->backing_file[len] = '\0'; > + > +    Âif (bs->backing_file[0] != '\0' && bdrv_is_fd_protocol(bs)) { > +      Â/* backing file currently not supported by fd: protocol */ > +      Âgoto fail; > +    Â} >   } >   return 0; > > diff --git a/block/qcow2.c b/block/qcow2.c > index 8451ded..8b9c160 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -270,6 +270,11 @@ static int qcow2_open(BlockDriverState *bs, int flags) >       goto fail; >     } >     bs->backing_file[len] = '\0'; > + > +    Âif (bs->backing_file[0] != '\0' && bdrv_is_fd_protocol(bs)) { > +      Âret = -ENOTSUP; > +      Âgoto fail; > +    Â} >   } >   if (qcow2_read_snapshots(bs) < 0) { >     ret = -EINVAL; > diff --git a/block/qed.c b/block/qed.c > index 3970379..5028897 100644 > --- a/block/qed.c > +++ b/block/qed.c > @@ -446,6 +446,10 @@ static int bdrv_qed_open(BlockDriverState *bs, int flags) >       return ret; >     } > > +    Âif (bs->backing_file[0] != '\0' && bdrv_is_fd_protocol(bs)) { > +      Âreturn -ENOTSUP; > +    Â} > + >     if (s->header.features & QED_F_BACKING_FORMAT_NO_PROBE) { >       pstrcpy(bs->backing_format, sizeof(bs->backing_format), "raw"); >     } > diff --git a/block/raw-posix.c b/block/raw-posix.c > index 4cd7d7a..c72de3d 100644 > --- a/block/raw-posix.c > +++ b/block/raw-posix.c > @@ -28,6 +28,7 @@ > Â#include "block_int.h" > Â#include "module.h" > Â#include "block/raw-posix-aio.h" > +#include "monitor.h" > > Â#ifdef CONFIG_COCOA > Â#include <paths.h> > @@ -183,7 +184,8 @@ static int raw_open_common(BlockDriverState *bs, const char *filename, >              Âint bdrv_flags, int open_flags) > Â{ >   BDRVRawState *s = bs->opaque; > -  Âint fd, ret; > +  Âint fd = -1; > +  Âint ret; > >   ret = raw_normalize_devicepath(&filename); >   if (ret != 0) { > @@ -205,15 +207,17 @@ static int raw_open_common(BlockDriverState *bs, const char *filename, >   if (!(bdrv_flags & BDRV_O_CACHE_WB)) >     s->open_flags |= O_DSYNC; > > -  Âs->fd = -1; > -  Âfd = qemu_open(filename, s->open_flags, 0644); > -  Âif (fd < 0) { > -    Âret = -errno; > -    Âif (ret == -EROFS) > -      Âret = -EACCES; > -    Âreturn ret; > +  Âif (s->fd == -1) { > +    Âfd = qemu_open(filename, s->open_flags, 0644); > +    Âif (fd < 0) { > +      Âret = -errno; > +      Âif (ret == -EROFS) { > +        Âret = -EACCES; > +      Â} > +      Âreturn ret; > +    Â} > +    Âs->fd = fd; >   } > -  Âs->fd = fd; >   s->aligned_buf = NULL; > >   if ((bdrv_flags & BDRV_O_NOCACHE)) { > @@ -270,6 +274,7 @@ static int raw_open(BlockDriverState *bs, const char *filename, int flags) > Â{ >   BDRVRawState *s = bs->opaque; > > +  Âs->fd = -1; >   s->type = FTYPE_FILE; >   return raw_open_common(bs, filename, flags, 0); > Â} > @@ -890,6 +895,60 @@ static BlockDriver bdrv_file = { >   .create_options = raw_create_options, > Â}; > > +static int raw_open_fd(BlockDriverState *bs, const char *filename, int flags) > +{ > +  ÂBDRVRawState *s = bs->opaque; > +  Âconst char *fd_str; > +  Âint fd; > + > +  Â/* extract the file descriptor - fail if it's not fd: */ > +  Âif (!strstart(filename, "fd:", &fd_str)) { > +    Âreturn -EINVAL; > +  Â} > + > +  Âif (!qemu_isdigit(fd_str[0])) { > +    Â/* get fd from monitor */ > +    Âfd = qemu_get_fd(fd_str); > +    Âif (fd == -1) { > +      Âreturn -EBADF; > +    Â} > +  Â} else { > +    Âchar *endptr = NULL; > + > +    Âfd = strtol(fd_str, &endptr, 10); > +    Âif (*endptr || (fd == 0 && fd_str == endptr)) { > +      Âreturn -EBADF; > +    Â} > +  Â} > + > +  Âs->fd = fd; > +  Âs->type = FTYPE_FILE; > + > +  Âreturn raw_open_common(bs, filename, flags, 0); > +} > + > +static BlockDriver bdrv_file_fd = { > +  Â.format_name = "file", > +  Â.protocol_name = "fd", > +  Â.instance_size = sizeof(BDRVRawState), > +  Â.bdrv_probe = NULL, /* no probe for protocols */ > +  Â.bdrv_file_open = raw_open_fd, > +  Â.bdrv_read = raw_read, > +  Â.bdrv_write = raw_write, > +  Â.bdrv_close = raw_close, > +  Â.bdrv_flush = raw_flush, > +  Â.bdrv_discard = raw_discard, > + > +  Â.bdrv_aio_readv = raw_aio_readv, > +  Â.bdrv_aio_writev = raw_aio_writev, > +  Â.bdrv_aio_flush = raw_aio_flush, > + > +  Â.bdrv_truncate = raw_truncate, > +  Â.bdrv_getlength = raw_getlength, > + > +  Â.create_options = raw_create_options, > +}; > + > Â/***********************************************/ > Â/* host device */ > > @@ -998,6 +1057,7 @@ static int hdev_open(BlockDriverState *bs, const char *filename, int flags) >   } > Â#endif > > +  Âs->fd = -1; >   s->type = FTYPE_FILE; > Â#if defined(__linux__) >   { > @@ -1168,6 +1228,7 @@ static int floppy_open(BlockDriverState *bs, const char *filename, int flags) >   BDRVRawState *s = bs->opaque; >   int ret; > > +  Âs->fd = -1; >   s->type = FTYPE_FD; > >   /* open will not fail even if no floppy is inserted, so add O_NONBLOCK */ > @@ -1280,6 +1341,7 @@ static int cdrom_open(BlockDriverState *bs, const char *filename, int flags) > Â{ >   BDRVRawState *s = bs->opaque; > > +  Âs->fd = -1; >   s->type = FTYPE_CD; > >   /* open will not fail even if no CD is inserted, so add O_NONBLOCK */ > @@ -1503,6 +1565,7 @@ static void bdrv_file_init(void) >   Â* Register all the drivers. ÂNote that order is important, the driver >   Â* registered last will get probed first. >   Â*/ > +  Âbdrv_register(&bdrv_file_fd); >   bdrv_register(&bdrv_file); >   bdrv_register(&bdrv_host_device); > Â#ifdef __linux__ > diff --git a/block/vmdk.c b/block/vmdk.c > index 922b23d..2ea808e 100644 > --- a/block/vmdk.c > +++ b/block/vmdk.c > @@ -353,6 +353,11 @@ static int vmdk_parent_open(BlockDriverState *bs) >       return -1; > >     pstrcpy(bs->backing_file, end_name - p_name + 1, p_name); > + > +    Âif (bs->backing_file[0] != '\0' && bdrv_is_fd_protocol(bs)) { > +      Â/* backing file currently not supported by fd: protocol */ > +      Âreturn -1; > +    Â} >   } > >   return 0; > diff --git a/block_int.h b/block_int.h > index fa91337..a305ee2 100644 > --- a/block_int.h > +++ b/block_int.h > @@ -152,6 +152,7 @@ struct BlockDriverState { >   int encrypted; /* if true, the media is encrypted */ >   int valid_key; /* if true, a valid encryption key has been set */ >   int sg;    Â/* if true, the device is a /dev/sg* */ > +  Âint fd_protocol; /* if true, the fd: protocol was specified */ bool? >   /* event callback when inserting/removing */ >   void (*change_cb)(void *opaque, int reason); >   void *change_opaque; > diff --git a/blockdev.c b/blockdev.c > index e81e0ab..a536c20 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -546,6 +546,10 @@ DriveInfo *drive_init(QemuOpts *opts, int default_to_scsi) > >   bdrv_flags |= ro ? 0 : BDRV_O_RDWR; > > +  Âif (strncmp(file, "fd:", 3) == 0) { > +    Âdinfo->bdrv->fd_protocol = 1; > +  Â} > + >   ret = bdrv_open(dinfo->bdrv, file, bdrv_flags, drv); >   if (ret < 0) { >     error_report("could not open disk image %s: %s", > @@ -606,6 +610,12 @@ int do_snapshot_blkdev(Monitor *mon, const QDict *qdict, QObject **ret_data) >     goto out; >   } > > +  Âif (bdrv_is_fd_protocol(bs)) { > +    Âqerror_report(QERR_UNSUPPORTED); > +    Âret = -1; > +    Âgoto out; > +  Â} > + >   pstrcpy(old_filename, sizeof(old_filename), bs->filename); > >   old_drv = bs->drv; > diff --git a/monitor.c b/monitor.c > index 59a3e76..ea60be2 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -2832,6 +2832,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname) >   return -1; > Â} > > +int qemu_get_fd(const char *fdname) > +{ > +  Âreturn cur_mon ? monitor_get_fd(cur_mon, fdname) : -1; > +} > + > Âstatic const mon_cmd_t mon_cmds[] = { > Â#include "hmp-commands.h" >   { NULL, NULL, }, > diff --git a/monitor.h b/monitor.h > index 4f2d328..de5b987 100644 > --- a/monitor.h > +++ b/monitor.h > @@ -51,6 +51,7 @@ int monitor_read_bdrv_key_start(Monitor *mon, BlockDriverState *bs, >                 void *opaque); > > Âint monitor_get_fd(Monitor *mon, const char *fdname); > +int qemu_get_fd(const char *fdname); > > Âvoid monitor_vprintf(Monitor *mon, const char *fmt, va_list ap) >   GCC_FMT_ATTR(2, 0); > diff --git a/qemu-options.hx b/qemu-options.hx > index 1d5ad8b..f9b66f4 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -116,7 +116,7 @@ using @file{/dev/cdrom} as filename (@pxref{host_drives}). > ÂETEXI > > ÂDEF("drive", HAS_ARG, QEMU_OPTION_drive, > -  Â"-drive [file=file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" > +  Â"-drive [file=[fd:]file][,if=type][,bus=n][,unit=m][,media=d][,index=i]\n" >   "    [,cyls=c,heads=h,secs=s[,trans=t]][,snapshot=on|off]\n" >   "    [,cache=writethrough|writeback|none|unsafe][,format=f]\n" >   "    [,serial=s][,addr=A][,id=name][,aio=threads|native]\n" > @@ -129,10 +129,12 @@ STEXI > ÂDefine a new drive. Valid options are: > > Â@table @option > -@item file=@var{file} > +@item file=[fd:]@var{file} > ÂThis option defines which disk image (@pxref{disk_images}) to use with > Âthis drive. If the filename contains comma, you must double it > -(for instance, "file=my,,file" to use file "my,file"). > +(for instance, "file=my,,file" to use file "my,file"). @option{fd:}@var{file} > +specifies the file descriptor of an already open disk image. > +@option{format=}@var{format} is required by @option{fd:}@var{file}. > Â@item if=@var{interface} > ÂThis option defines on which type on interface the drive is connected. > ÂAvailable types are: ide, scsi, sd, mtd, floppy, pflash, virtio. > diff --git a/qemu-tool.c b/qemu-tool.c > index 41e5c41..8fe6b8c 100644 > --- a/qemu-tool.c > +++ b/qemu-tool.c > @@ -96,3 +96,8 @@ int64_t qemu_get_clock_ns(QEMUClock *clock) > Â{ >   return 0; > Â} > + > +int qemu_get_fd(const char *fdname) > +{ > +  Âreturn -1; > +} > -- > 1.7.1 > > > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list