Am 23.07.2012 15:08, schrieb Corey Bryant: > When qemu_open is passed a filename of the "/dev/fdset/nnn" > format (where nnn is the fdset ID), an fd with matching access > mode flags will be searched for within the specified monitor > fd set. If the fd is found, a dup of the fd will be returned > from qemu_open. > > Each fd set has a reference count. The purpose of the reference > count is to determine if an fd set contains file descriptors that > have open dup() references that have not yet been closed. It is > incremented on qemu_open and decremented on qemu_close. It is > not until the refcount is zero that file desriptors in an fd set > can be closed. If an fd set has dup() references open, then we > must keep the other fds in the fd set open in case a reopen > of the file occurs that requires an fd with a different access > mode. > > Signed-off-by: Corey Bryant <coreyb@xxxxxxxxxxxxxxxxxx> > > v2: > -Get rid of file_open and move dup code to qemu_open > (kwolf@xxxxxxxxxx) > -Use strtol wrapper instead of atoi (kwolf@xxxxxxxxxx) > > v3: > -Add note about fd leakage (eblake@xxxxxxxxxx) > > v4 > -Moved patch to be later in series (lcapitulino@xxxxxxxxxx) > -Update qemu_open to check access mode flags and set flags that > can be set (eblake@xxxxxxxxxx, kwolf@xxxxxxxxxx) > > v5: > -This patch was overhauled quite a bit in this version, with > the addition of fd set and refcount support. > -Use qemu_set_cloexec() on dup'd fd (eblake@xxxxxxxxxx) > -Modify flags set by fcntl on dup'd fd (eblake@xxxxxxxxxx) > -Reduce syscalls when setting flags for dup'd fd (eblake@xxxxxxxxxx) > -Fix O_RDWR, O_RDONLY, O_WRONLY checks (eblake@xxxxxxxxxx) > --- > block/raw-posix.c | 24 +++++----- > block/raw-win32.c | 2 +- > block/vmdk.c | 4 +- > block/vpc.c | 2 +- > block/vvfat.c | 12 ++--- > cutils.c | 5 ++ > monitor.c | 85 +++++++++++++++++++++++++++++++++ > monitor.h | 4 ++ > osdep.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++- > qemu-common.h | 3 +- > qemu-tool.c | 12 +++++ > 11 files changed, 267 insertions(+), 24 deletions(-) > > diff --git a/block/raw-posix.c b/block/raw-posix.c > index a172de3..5d0a801 100644 > --- a/block/raw-posix.c > +++ b/block/raw-posix.c > @@ -271,7 +271,7 @@ static int raw_open_common(BlockDriverState *bs, const char *filename, > out_free_buf: > qemu_vfree(s->aligned_buf); > out_close: > - qemu_close(fd); > + qemu_close(fd, filename); > return -errno; > } Hm, not a nice interface where qemu_close() needs the filename and (worse) could be given a wrong filename. Maybe it would be better to maintain a list of fd -> fdset mappings in qemu_open/close? But if we decided to keep it like this, please use the right interface from the beginning in patch 5 instead of updating it here. > @@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool in_use) > } > } > > +void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id) > +{ > + mon_fdset_t *mon_fdset; > + > + if (!mon) { > + return; > + } > + > + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) { > + if (mon_fdset->id == fdset_id) { > + mon_fdset->refcount++; > + break; > + } > + } > +} > + > +void monitor_fdset_decrement_refcount(Monitor *mon, int64_t fdset_id) > +{ > + mon_fdset_t *mon_fdset; > + > + if (!mon) { > + return; > + } > + > + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) { > + if (mon_fdset->id == fdset_id) { > + mon_fdset->refcount--; > + if (mon_fdset->refcount == 0) { > + monitor_fdset_cleanup(mon_fdset); > + } > + break; > + } > + } > +} These two functions are almost the same. Would a monitor_fdset_update_refcount(mon, fdset_id, value) make sense? These functions could then be kept as thin wrappers around it, or they could even be dropped completely. > + > +int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags) > +{ > + mon_fdset_t *mon_fdset; > + mon_fdset_fd_t *mon_fdset_fd; > + int mon_fd_flags; > + > + if (!mon) { > + errno = ENOENT; > + return -1; > + } > + > + QLIST_FOREACH(mon_fdset, &mon->fdsets, next) { > + if (mon_fdset->id != fdset_id) { > + continue; > + } > + QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) { > + if (mon_fdset_fd->removed) { > + continue; > + } > + > + mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL); > + if (mon_fd_flags == -1) { > + return -1; > + } > + > + switch (flags & O_ACCMODE) { > + case O_RDWR: > + if ((mon_fd_flags & O_ACCMODE) == O_RDWR) { > + return mon_fdset_fd->fd; > + } > + break; > + case O_RDONLY: > + if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) { > + return mon_fdset_fd->fd; > + } > + break; > + case O_WRONLY: > + if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) { > + return mon_fdset_fd->fd; > + } > + break; > + } I think you mean: if ((flags & O_ACCMODE) == (mon_fd_flags & O_ACCMODE)) { return mon_fdset_fd->fd; } What about other flags that cannot be set with fcntl(), like O_SYNC on older kernels or possibly non-Linux? (The block layer doesn't use it any more, but I think we want to keep the function generally useful) > + } > + errno = EACCES; > + return -1; > + } > + errno = ENOENT; > + return -1; > +} > + > /* mon_cmds and info_cmds would be sorted at runtime */ > static mon_cmd_t mon_cmds[] = { > #include "hmp-commands.h" > @@ -75,6 +76,79 @@ int qemu_madvise(void *addr, size_t len, int advice) > #endif > } > > +/* > + * Dups an fd and sets the flags > + */ > +static int qemu_dup(int fd, int flags) > +{ > + int i; > + int ret; > + int serrno; > + int dup_flags; > + int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, > + O_NONBLOCK, 0 }; > + > + if (flags & O_CLOEXEC) { > + ret = fcntl(fd, F_DUPFD_CLOEXEC, 0); > + if (ret == -1 && errno == EINVAL) { > + ret = dup(fd); > + if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) { > + goto fail; > + } > + } > + } else { > + ret = dup(fd); > + } > + > + if (ret == -1) { > + goto fail; > + } > + > + dup_flags = fcntl(ret, F_GETFL); > + if (dup_flags == -1) { > + goto fail; > + } > + > + if ((flags & O_SYNC) != (dup_flags & O_SYNC)) { > + errno = EINVAL; > + goto fail; > + } It's worth trying to set it before failing, newer kernels can do it. But as I said above, if you can fail here, it makes sense to consider O_SYNC when selecting the right file descriptor from the fdset. > + /* Set/unset flags that we can with fcntl */ > + i = 0; > + while (setfl_flags[i] != 0) { > + if (flags & setfl_flags[i]) { > + dup_flags = (dup_flags | setfl_flags[i]); > + } else { > + dup_flags = (dup_flags & ~setfl_flags[i]); > + } > + i++; > + } What about this instead of the loop: int setfl_flags = O_APPEND | O_ASYNC | ... ; dup_flags &= ~setfl_flags; dup_flags |= (flags & setfl_flags); > + > + if (fcntl(ret, F_SETFL, dup_flags) == -1) { > + goto fail; > + } > + > + /* Truncate the file in the cases that open() would truncate it */ > + if (flags & O_TRUNC || > + ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) { > + if (ftruncate(ret, 0) == -1) { > + goto fail; > + } > + } O_CREAT | O_EXCL kind of loses its meaning here, but okay, it's hard to do better with file descriptors. > + > + qemu_set_cloexec(ret); Wait... If O_CLOEXEC is set, you set the flag immediately and if it isn't you set it at the end of the function? What's the intended use case for not using O_CLOEXEC then? Kevin -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list