Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 07/02/2012 06:02 PM, Corey Bryant wrote:


On 06/26/2012 06:54 PM, Eric Blake wrote:
On 06/26/2012 04:28 PM, Corey Bryant wrote:

With this proposed series, we have usage akin to:

    1. pass_fd FDSET={M} -> returns a string "/dev/fd/N" showing
QEMU's
       view of the FD
    2. drive_add file=/dev/fd/N
    3. if failure:
         close_fd "/dev/fd/N"

In fact, there are more steps:

4. use it successfully
5. close_fd "/dev/fd/N"

I think it would well be possible that qemu just closes the fd when
it's
not used internally any more.

pass-fd could have a flag indicating qemu to do that.


It seems like libvirt would be in a better position to understand when a
file is no longer in use, and then it can call close_fd.  No?  Of course
the the only fd that needs to be closed is the originally passed fd. The
dup'd fd's are closed by QEMU.

The more we discuss this, the more I'm convinced that commands like
'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
will need to have an optional argument that says the name of the file to
reopen.  That is, I've seen three proposals:

Proposal One: enhance every command to take an fd:name protocol.
PRO: Successful use of a name removes the fd from the 'getfd' list.
CON: Lots of commands to change.
CON: The command sequence is not atomic.
CON: Block layer does not have a file name tied to the fd, so the reopen
operation also needs to learn the fd:name protocol, but since we're
already touching the command it's not much more work.
USAGE:
1. getfd FDSET={M}, ties new fd to "name"
2. drive_add fd=name - looks up the fd by name from the list
3a. on success, fd is no longer in the list, drive_add consumed it
3b. on failure, libvirt must call 'closefd name' to avoid a leak
4. getfd FDSET={M2}, ties new fd to "newname"
5. block-commit fd=newname - looks up fd by name from the list
6a. on success, fd is no longer in the list, block-commit consumed it
6b. on failure, libvirt must call 'closefd newname' to avoid a leak

Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
update that name to the new fd in advance of any operation that needs to
do a reopen.
PRO: All other commands work without impact by using qemu_open(), by
getting a clone of the permanent name.
CON: The permanent name must remain open as long as qemu might need to
reopen it, and reopening needs the pass-fd force option.
CON: The command sequence is not atomic.
USAGE:
1. pass_fd FDSET={M} -> returns an integer N (or a string "/dev/fd/N")
showing QEMU's permanent name of the FD
2. drive_add file=<permanent name> means that qemu_open() will clone the
fd, and drive_add is now using yet another FD while N remains open;
meanwhile, the block layer knows that the drive is tied to the permanent
name
3. pass-fd force FDSET={M2} -> replaces fd N with the passed M2, and
still returns /dev/fd/N
4. block-commit - looks up the block-device name (/dev/fd/N), which maps
back to the permanent fd, and gets a copy of the newly passed M2
5. on completion (success or failure), libvirt must call 'closefd name'
to avoid a leak

Proposal Three: Use thread-local fds passed alongside any command,
rather than passing via a separate command
PRO: All commands can now atomically handle fd passing
PRO: Commands that only need a one-shot open can now use fd
CON: Commands that need to reopen still need modification to allow a
name override during the reopen
1. drive_add nfds=1 file="/dev/fdlist/1" FDSET={M} -> on success, the fd
is used as the block file, on failure it is atomically closed, so there
is no leak either way. However, the block file has no permanent name.
2. block-commit nfds=1 file-override="/dev/fdlist/1" FDSET={M2} -> file
override must be passed again (since we have no guarantee that the
/dev/fdlist/1 of _this_ command matches the command-local naming used in
the previous command), but the fd is again used atomically

Under proposal 3, there is no need to dup fd's, merely only to check
that qemu_open("/dev/fdlist/n", flags) has compatible flags with the fd
received via FDSET.  And the fact that things are made atomic means that
libvirt no longer has to worry about calling closefd, nor does it have
to worry about being interrupted between two monitor commands and not
knowing what state qemu is in.  But it does mean teaching every possible
command that wants to do a reopen to learn how to use fd overrides
rather than to reuse the permanent name that was originally in place on
the first open.


Here's another option that Kevin and I discussed today on IRC.  I've
modified a few minor details since the discussion. And Kevin please
correct me if anything is wrong.

Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
should all refer to the same file, but may have different access flags
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
matching access mode flags.
pass-fds:
     { 'command': 'pass-fds',
       'data': {'fdsetname': 'str', '*close': 'bool'},
       'returns': 'string' }
close-fds:
     { 'command': 'close-fds',
       'data': {'fdsetname': 'str' }
where:
      @fdsetname - the file descriptor set name
      @close - *if true QEMU closes the monitor fds when they expire.
               if false, fds remain on the monitor until close-fds
               command.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage if close==true specified
CON: Determining when to close fds when close==true may be tricky
USAGE:
1. pass-fd FDSET={M} -> returns a string "/dev/fdset/1") that refers to
the passed set of fds.
2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
list that has access flags matching the qemu_open() action flags.
3. block-commit -- qemu_open() reopens "/dev/fdset/1" by using the first
fd from the list that has access flags matching the qemu_open() action
flags.
4. The monitor fds are closed:
    A) *If @close == true, qemu closes the set of fds when the timer
       expires
    B) If @close == false, libvirt must issue close-fds command to close
       the set of fds

*How to solve this has yet to be determined.  Kevin mentioned
potentially using a bottom-half or a timer in qemu_close().


Thanks again for taking time to discuss this at today's QEMU community call.

Here's the proposal we discussed at the call. Please let me know if I missed anything or if there are any issues with this design.

Proposal Five: New monitor commands enable adding/removing an fd to/from a set. The set of fds should all refer to the same file, but may have different access flags (ie. O_RDWR, O_RDONLY). qemu_open can then dup the fd that has the matching access mode flags.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage (fds are associated with monitor connection and, if not in use, closed when monitor disconnects) PRO: Security-wise this is ok since libvirt can manage the set of fd's (ie. it can add/remove an O_RDWR fd to/from the set as needed).
CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
USAGE:
1. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named "/dev/fdset/1" - command returns qemu fd (e.g fd=4) to caller. libvirt in-use flag turned on for fd.
2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from the
set that has access flags matching the qemu_open action flags. qemu_open increments refcount for this fd. 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5). libvirt in-use flag turned on for fd.
3. block-commit -> qemu_open reopens "/dev/fdset/1" by using the first
fd from the set that has access flags matching the qemu_open action flags. qemu_open increments refcount for this fd. 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from the set. turns libvirt in-use flag off marking the fd ready to be closed when qemu is done with it. 5. qemu_close decrements refcount for fd, and closes fd when refcount is zero and libvirt in use flag is off.

More functional details:
-If libvirt crashes it can call "query-fd /dev/fdset/1" to determine which fds are open in the set. -If monitor connection closes, qemu will close fds that have a refcount of zero. Do we also need a qemu in-use flag in case refcount is zero and fd is still in use? -This support requires introduction of qemu_close function that will be called everywhere in block layer that close is currently called.

Notes:
-Patch series 1 will include support for all of the above. This will be my initial focus. -Patch series 2 will include command line support that enables association of command line fd with a monitor set. This will be my secondary focus, most likely after patch series 1 is applied.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list


[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]