On Tue, Jun 26, 2012 at 02:40:03PM -0400, Corey Bryant wrote: > > > On 06/26/2012 11:37 AM, Corey Bryant wrote: > > > > > >On 06/26/2012 11:03 AM, Daniel P. Berrange wrote: > >>On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote: > >>>On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote: > >>>>>So now from a client's POV you'd have a flow like > >>>>> > >>>>> * drive_add "file=/dev/fd/N" FDSET={N} > >>>> > >>>>IIUC then drive_add would loop and pass each fd in the set via > >>>>SCM_RIGHTS? > >>> > >>>Yes, you'd probably use the JSON to tell QEMU exactly > >>>how many FDs you're intending to pass with the command, > >>>and then once the command is received, you'd have N*SCM_RIGHTS > >>>messages sent/received. > >>> > >>>>> > >>>>>And in QEMU you'd have something like > >>>>> > >>>>> * handle_monitor_command > >>>>> - recvmsg all FDs, and stash them in a thread local > >>>>>"FDContext" > >>>>> context > >>>>> - invoke monitor command handler > >>>>> - Sees file=/dev/fd/N > >>>>> - Fetch /dev/fd/N from "FDContext" > >>>>> - If success remove /dev/fd/N from "FDContext" > >>>>> - close() all FDs left in "FDContext" > >>>>> > >>>>>The key point with this is that because the FDs are directly > >>>>>associated with a monitor command, QEMU can /guarantee/ that > >>>>>FDs are never leaked, regardless of client behaviour. > >>>> > >>>>Wouldn't this leak fds if libvirt crashed part way through sending > >>>>the set of fds? > >>> > >>>No, because you've scoped the FDs to the current monitor instance, > >>>and the current command being processed you know to close all FDs > >>>when the associated command has finished, as well as closing them > >>>all if you see EOF on the monitor socket while in the middle of > >>>receiving a command. > >> > >>Here is a quick proof of concept (ie untested) patch to demonstrate > >>what I mean. It relies on Cory's patch which converts everything > >>to use qemu_open. It is also still valuable to make the change > >>to qemu_open() to support "/dev/fd/N" for passing FDs during > >>QEMU initial startup for CLI args. > >> > >>IMHO, what I propose here is preferrable for QMP clients that > >>our current plan of requiring use of 3 monitor commands (passfd, > >>XXXXX, closefd). > > > >Thanks for the PoC. > > > >Two other required updates that I can think of would be: > > > >1) Update change, block_stream, block_reopen, snapshot_blkdev, and > >perhaps other monitor commands to support receiving fd's via SCM_RIGHTS. > > > > Nevermind my comment. I see that your PoC supports passing nfds for > any QMP command. > > I'm curious what Kevin's thoughts are on this and the overall approach. > > >2) Support re-opening files with different access modes (O_RDONLY, > >O_WRONLY, or O_RDWR). This would be similar to the force option for > >pass-fd. > > > > I'm still not quite sure how we'd go about this. We need a way to > determine the existing QEMU fd that is to be re-associated with the > new fd, when using a /dev/fdlist/0 filename. In this new approach, > 0 corresponds to an index, not an fd. The prior approach of using > the /dev/fd/nnn, where nnn corresponded to an actual QEMU fd, made > this easy. Hmm, I'm not sure I understand what the use case is for the 'force' parameter ? In my proof of concept I left out the block of code from qemu_open() you had for dup'ing the FD with various different flags set, but that was just for brevity. I think it ought to fit in the same way, unless you're refering to a different area of the code. >> >>@@ -83,6 +158,26 @@ int qemu_open(const char *name, int flags, ...) > >> { > >> int ret; > >> int mode = 0; > >>+ const char *p; > >>+ > >>+ /* Attempt dup of fd for pre-opened file */ > >>+ if (strstart(name, "/dev/fdlist/", &p)) { > >>+ int idx; > >>+ int fd; > >>+ > >>+ idx = qemu_parse_fd(p); > >>+ if (idx == -1) { > >>+ return -1; > >>+ } > >>+ > >>+ fd = qemu_fdlist_dup(idx); > >>+ if (fd == -1) { > >>+ return -1; > >>+ } > >>+ > >>+ /* XXX verify flags match */ eg, this part of the code you had some work related to 'flags' > >>+ return fd; > >>+ } > >> > >> if (flags & O_CREAT) { > >> va_list ap; Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list