[PATCH V1 0/8] vfio virtual address update redo

Re-architect the interfaces that allow the underlying memory object of an
iova range to be mapped in a new address space.  The old interfaces allow
userland to indefinitely block vfio mediated device kernel threads, and do
not propagate the locked_vm count to a new mm.

Interface changes:
  - disable the VFIO_UPDATE_VADDR extension
  - delete VFIO_DMA_UNMAP_FLAG_VADDR
  - redefine VFIO_DMA_MAP_FLAG_VADDR

New interfaces:
  - VFIO_CHANGE_DMA_OWNER iommu driver ioctl
  - VFIO_DMA_OWNER extension, consisting of VFIO_CHANGE_DMA_OWNER and the
    redefined VFIO_DMA_MAP_FLAG_VADDR.

VFIO_DMA_MAP_FLAG_VADDR changes the base virtual address for a dma mapping.
It is used after exec, once the application has remapped the corresponding
shared memory object that was preserved across exec.  However, the change does
not take effect until VFIO_CHANGE_DMA_OWNER is called.  This allows the
application to iterate over all dma mappings, register a new vaddr for each,
and have them take effect atomically.

VFIO_CHANGE_DMA_OWNER changes the task and mm for all dma mappings to that of
the caller, and transfers the locked_vm count from the old to the new mm.  The
vaddr for each mapping must either be the same in the old and new mm (e.g. after
fork), or must have been updated with VFIO_DMA_MAP_FLAG_VADDR (e.g. after exec).
Subsequently, the caller is the only task that is allowed to pin pages for dma.
This prevents an application from exceeding the initial task's RLIMIT_MEMLOCK
by forking and pinning in children.
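
As an illustration only, the staging and ownership semantics described above
can be modeled in plain userspace C.  This is a toy sketch, not the type1
implementation: struct toy_mm, struct toy_dma, struct toy_container,
toy_update_vaddr() and toy_change_owner() are all hypothetical names invented
for the example, and error cases such as rlimit checks are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the example; none of these are kernel types. */
struct toy_mm {
	long locked_vm;			/* pages accounted to this mm */
};

struct toy_dma {
	uint64_t iova;
	uint64_t vaddr;			/* vaddr valid in the owner's mm */
	uint64_t new_vaddr;		/* staged by the update step */
	bool vaddr_pending;		/* staged value not yet applied */
	long locked_pages;		/* pages this mapping accounts for */
};

struct toy_container {
	struct toy_mm *owner;
	struct toy_dma *dma;
	size_t ndma;
};

/*
 * Models the redefined VFIO_DMA_MAP_FLAG_VADDR: record a new vaddr for the
 * mapping at iova, but do not apply it yet.  Returns false if no mapping
 * starts at iova.
 */
static bool toy_update_vaddr(struct toy_container *c, uint64_t iova,
			     uint64_t va)
{
	for (size_t i = 0; i < c->ndma; i++) {
		if (c->dma[i].iova == iova) {
			c->dma[i].new_vaddr = va;
			c->dma[i].vaddr_pending = true;
			return true;
		}
	}
	return false;
}

/*
 * Models VFIO_CHANGE_DMA_OWNER: apply every staged vaddr, and move each
 * mapping's locked page count from the old mm to the new mm.
 */
static void toy_change_owner(struct toy_container *c, struct toy_mm *new_mm)
{
	for (size_t i = 0; i < c->ndma; i++) {
		struct toy_dma *d = &c->dma[i];

		if (d->vaddr_pending) {
			d->vaddr = d->new_vaddr;
			d->vaddr_pending = false;
		}
		if (c->owner != new_mm) {
			c->owner->locked_vm -= d->locked_pages;
			new_mm->locked_vm += d->locked_pages;
		}
	}
	c->owner = new_mm;
}
```

In this model a staged vaddr is invisible until toy_change_owner() runs, and
a mapping with no staged vaddr keeps its old one, which corresponds to the
fork case where the address is the same in both mm's.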

These interfaces can be used to implement live update, in which a process such
as qemu execs an updated version of itself, while preserving its guest and
vfio devices.  The application must preserve the vfio descriptors across fork
and exec, and must not start each step below until the previous step has
finished.

  parent				      child

  1. fork
					      2. ioctl(VFIO_CHANGE_DMA_OWNER)
  3. exec new binary

  4. foreach dma mapping
       va = mmap()
       ioctl(, VFIO_DMA_MAP_FLAG_VADDR, va)

     ioctl(VFIO_CHANGE_DMA_OWNER)

					      5. exit
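
One way an application might enforce the ordering of steps 1, 2 and 5 is
with a pair of pipes, sketched below.  This is an assumption about how
userland could sequence the steps, not code from this series.
take_dma_ownership() is a hypothetical placeholder that stands in for
ioctl(VFIO_CHANGE_DMA_OWNER) on the container descriptor, and
fork_dma_holder() is likewise an invented helper name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical placeholder for step 2.  A real application would issue
 * the VFIO_CHANGE_DMA_OWNER ioctl on container_fd here.
 */
static int take_dma_ownership(int container_fd)
{
	(void)container_fd;
	return 0;
}

/*
 * Steps 1, 2 and 5.  Forks a child that takes ownership of the dma
 * mappings and then parks.  Returns the child's pid once step 2 has
 * finished, or -1 on error.  Closing *release_fd lets the child exit.
 */
static pid_t fork_dma_holder(int container_fd, int *release_fd)
{
	int ready[2], release[2];
	ssize_t n;
	pid_t pid;
	char c = 'k';

	if (pipe(ready) < 0)
		return -1;
	if (pipe(release) < 0) {
		close(ready[0]);
		close(ready[1]);
		return -1;
	}

	pid = fork();					/* step 1 */
	if (pid < 0) {
		close(ready[0]);
		close(ready[1]);
		close(release[0]);
		close(release[1]);
		return -1;
	}

	if (pid == 0) {
		close(ready[0]);
		close(release[1]);
		if (take_dma_ownership(container_fd) < 0)	/* step 2 */
			_exit(1);
		if (write(ready[1], &c, 1) != 1)
			_exit(1);
		/* Park until the write end of the release pipe is closed. */
		do {
			n = read(release[0], &c, 1);
		} while (n > 0 || (n < 0 && errno == EINTR));
		_exit(0);					/* step 5 */
	}

	close(ready[1]);
	close(release[0]);
	do {
		n = read(ready[0], &c, 1);
	} while (n < 0 && errno == EINTR);
	close(ready[0]);
	if (n != 1) {
		/* The child failed before completing step 2. */
		close(release[1]);
		waitpid(pid, NULL, 0);
		return -1;
	}
	*release_fd = release[1];
	return pid;
}
```

The descriptor returned in *release_fd is created without close-on-exec, so
it survives step 3.  The new binary would close it only after its own
ownership change in step 4 has returned, at which point the child exits.
How the descriptor number is passed to the new binary is left to the
application.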

With this arrangement, the dma mappings are always associated with a valid
mm, and mediated device requests such as vfio_pin_pages and vfio_dma_rw block
only briefly during the ioctls.  Thanks to Jason Gunthorpe for suggesting fork
and the change mm ioctl.

Lastly, if a task exits or execs, and it still owns any dma mappings, they
are unmapped and unpinned.  This guarantees that pages do not remain pinned
indefinitely if a vfio descriptor is leaked to another process, and requires
tasks to explicitly transfer ownership of dma (and hence locked_vm) to a new
task and mm when continued operation is desired.  The vfio driver maps a
special vma so it can detect exit and exec, via the vm_operations_struct
close callback.

Steve Sistare (8):
  vfio: delete interfaces to update vaddr
  vfio/type1: dma owner permission
  vfio: close dma owner
  vfio/type1: close dma owner
  vfio/type1: track locked_vm per dma
  vfio/type1: update vaddr
  vfio: change dma owner
  vfio/type1: change dma owner

 drivers/vfio/container.c        | 169 ++++++++++++++++++-
 drivers/vfio/vfio.h             |   9 +-
 drivers/vfio/vfio_iommu_type1.c | 362 +++++++++++++++++++++++-----------------
 include/uapi/linux/vfio.h       |  54 ++++--
 4 files changed, 412 insertions(+), 182 deletions(-)

-- 
1.8.3.1



