Re: [PATCH v5 00/15] vDPA shadow virtqueue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




在 2022/3/7 下午11:33, Eugenio Pérez 写道:
This series enable shadow virtqueue (SVQ) for vhost-vdpa devices. This
is intended as a new method of tracking the memory the devices touch
during a migration process: Instead of relay on vhost device's dirty
logging capability, SVQ intercepts the VQ dataplane forwarding the
descriptors between VM and device. This way qemu is the effective
writer of guests memory, like in qemu's virtio device operation.

When SVQ is enabled qemu offers a new virtual address space to the
device to read and write into, and it maps new vrings and the guest
memory in it. SVQ also intercepts kicks and calls between the device
and the guest. Used buffers relay would cause dirty memory being
tracked.

This effectively means that vDPA device passthrough is intercepted by
qemu. While SVQ should only be enabled at migration time, the switching
from regular mode to SVQ mode is left for a future series.

It is based on the ideas of DPDK SW assisted LM, in the series of
DPDK's https://patchwork.dpdk.org/cover/48370/ . However, these does
not map the shadow vq in guest's VA, but in qemu's.

For qemu to use shadow virtqueues the guest virtio driver must not use
features like event_idx.

SVQ needs to be enabled with cmdline:

-netdev type=vhost-vdpa,vhostdev=vhost-vdpa-0,id=vhost-vdpa0,svq=on

The first three patches enables notifications forwarding with
assistance of qemu. It's easy to enable only this if the relevant
cmdline part of the last patch is applied on top of these.

Next four patches implement the actual buffer forwarding. However,
address are not translated from HVA so they will need a host device with
an iommu allowing them to access all of the HVA range.

The last part of the series uses properly the host iommu, so qemu
creates a new iova address space in the device's range and translates
the buffers in it. Finally, it adds the cmdline parameter.

Some simple performance tests with netperf were done. They used a nested
guest with vp_vdpa, vhost-kernel at L0 host. Starting with no svq and a
baseline average of ~9009.96Mbps:
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
131072  16384  16384    30.01    9061.03
131072  16384  16384    30.01    8962.94
131072  16384  16384    30.01    9005.92

To enable SVQ buffers forwarding reduce throughput to about
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec
131072  16384  16384    30.01    7689.72
131072  16384  16384    30.00    7752.07
131072  16384  16384    30.01    7750.30

However, many performance improvements were left out of this series for
simplicity, so difference should shrink in the future.

Comments are welcome.


Hi Michael:

What do you think of this series? It looks good to me as a start. The feature could only be enabled as a dedicated parameter. If you're ok, I'd try to make it for 7.0.

Thanks



TODO on future series:
* Event, indirect, packed, and others features of virtio.
* To support different set of features between the device<->SVQ and the
   SVQ<->guest communication.
* Support of device host notifier memory regions.
* To sepparate buffers forwarding in its own AIO context, so we can
   throw more threads to that task and we don't need to stop the main
   event loop.
* Support multiqueue virtio-net vdpa.
* Proper documentation.

Changes from v4:
* Iterate iova->hva tree instead on maintain own tree so we support HVA
   overlaps.
* Fix: Errno completion at failure.
* Rename x-svq to svq, so changes to stable does not affect cmdline parameter.

Changes from v3:
* Add @unstable feature to NetdevVhostVDPAOptions.x-svq.
* Fix uncomplete mapping (by 1 byte) of memory regions if svq is enabled.
v3 link:
https://lore.kernel.org/qemu-devel/20220302203012.3476835-1-eperezma@xxxxxxxxxx/

Changes from v2:
* Less assertions and more error handling in iova tree code.
* Better documentation, both fixing errors and making @param: format
* Homogeneize SVQ avail_idx_shadow and shadow_used_idx to make shadow a
   prefix at both times.
* Fix: Fo not use VirtQueueElement->len field, track separatedly.
* Split vhost_svq_{enable,disable}_notification, so the code looks more
   like the kernel driver code.
* Small improvements.
v2 link:
https://lore.kernel.org/all/CAJaqyWfXHE0C54R_-OiwJzjC0gPpkE3eX0L8BeeZXGm1ERYPtA@xxxxxxxxxxxxxx/

Changes from v1:
* Feature set at device->SVQ is now the same as SVQ->guest.
* Size of SVQ is not max available device size anymore, but guest's
   negotiated.
* Add VHOST_FILE_UNBIND kick and call fd treatment.
* Make SVQ a public struct
* Come back to previous approach to iova-tree
* Some assertions are now fail paths. Some errors are now log_guest.
* Only mask _F_LOG feature at vdpa_set_features svq enable path.
* Refactor some errors and messages. Add missing error unwindings.
* Add memory barrier at _F_NO_NOTIFY set.
* Stop checking for features flags out of transport range.
v1 link:
https://lore.kernel.org/virtualization/7d86c715-6d71-8a27-91f5-8d47b71e3201@xxxxxxxxxx/

Changes from v4 RFC:
* Support of allocating / freeing iova ranges in IOVA tree. Extending
   already present iova-tree for that.
* Proper validation of guest features. Now SVQ can negotiate a
   different set of features with the device when enabled.
* Support of host notifiers memory regions
* Handling of SVQ full queue in case guest's descriptors span to
   different memory regions (qemu's VA chunks).
* Flush pending used buffers at end of SVQ operation.
* QMP command now looks by NetClientState name. Other devices will need
   to implement it's way to enable vdpa.
* Rename QMP command to set, so it looks more like a way of working
* Better use of qemu error system
* Make a few assertions proper error-handling paths.
* Add more documentation
* Less coupling of virtio / vhost, that could cause friction on changes
* Addressed many other small comments and small fixes.

Changes from v3 RFC:
   * Move everything to vhost-vdpa backend. A big change, this allowed
     some cleanup but more code has been added in other places.
   * More use of glib utilities, especially to manage memory.
v3 link:
https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg06032.html

Changes from v2 RFC:
   * Adding vhost-vdpa devices support
   * Fixed some memory leaks pointed by different comments
v2 link:
https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg05600.html

Changes from v1 RFC:
   * Use QMP instead of migration to start SVQ mode.
   * Only accepting IOMMU devices, closer behavior with target devices
     (vDPA)
   * Fix invalid masking/unmasking of vhost call fd.
   * Use of proper methods for synchronization.
   * No need to modify VirtIO device code, all of the changes are
     contained in vhost code.
   * Delete superfluous code.
   * An intermediate RFC was sent with only the notifications forwarding
     changes. It can be seen in
     https://patchew.org/QEMU/20210129205415.876290-1-eperezma@xxxxxxxxxx/
v1 link:
https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html

Eugenio Pérez (20):
       virtio: Add VIRTIO_F_QUEUE_STATE
       virtio-net: Honor VIRTIO_CONFIG_S_DEVICE_STOPPED
       virtio: Add virtio_queue_is_host_notifier_enabled
       vhost: Make vhost_virtqueue_{start,stop} public
       vhost: Add x-vhost-enable-shadow-vq qmp
       vhost: Add VhostShadowVirtqueue
       vdpa: Register vdpa devices in a list
       vhost: Route guest->host notification through shadow virtqueue
       Add vhost_svq_get_svq_call_notifier
       Add vhost_svq_set_guest_call_notifier
       vdpa: Save call_fd in vhost-vdpa
       vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call
       vhost: Route host->guest notification through shadow virtqueue
       virtio: Add vhost_shadow_vq_get_vring_addr
       vdpa: Save host and guest features
       vhost: Add vhost_svq_valid_device_features to shadow vq
       vhost: Shadow virtqueue buffers forwarding
       vhost: Add VhostIOVATree
       vhost: Use a tree to store memory mappings
       vdpa: Add custom IOTLB translations to SVQ

Eugenio Pérez (15):
   vhost: Add VhostShadowVirtqueue
   vhost: Add Shadow VirtQueue kick forwarding capabilities
   vhost: Add Shadow VirtQueue call forwarding capabilities
   vhost: Add vhost_svq_valid_features to shadow vq
   virtio: Add vhost_svq_get_vring_addr
   vdpa: adapt vhost_ops callbacks to svq
   vhost: Shadow virtqueue buffers forwarding
   util: Add iova_tree_alloc_map
   util: add iova_tree_find_iova
   vhost: Add VhostIOVATree
   vdpa: Add custom IOTLB translations to SVQ
   vdpa: Adapt vhost_vdpa_get_vring_base to SVQ
   vdpa: Never set log_base addr if SVQ is enabled
   vdpa: Expose VHOST_F_LOG_ALL on SVQ
   vdpa: Add x-svq to NetdevVhostVDPAOptions

  qapi/net.json                      |   8 +-
  hw/virtio/vhost-iova-tree.h        |  27 ++
  hw/virtio/vhost-shadow-virtqueue.h |  87 ++++
  include/hw/virtio/vhost-vdpa.h     |   8 +
  include/qemu/iova-tree.h           |  38 +-
  hw/virtio/vhost-iova-tree.c        | 110 +++++
  hw/virtio/vhost-shadow-virtqueue.c | 637 +++++++++++++++++++++++++++++
  hw/virtio/vhost-vdpa.c             | 525 +++++++++++++++++++++++-
  net/vhost-vdpa.c                   |  48 ++-
  util/iova-tree.c                   | 169 ++++++++
  hw/virtio/meson.build              |   2 +-
  11 files changed, 1633 insertions(+), 26 deletions(-)
  create mode 100644 hw/virtio/vhost-iova-tree.h
  create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
  create mode 100644 hw/virtio/vhost-iova-tree.c
  create mode 100644 hw/virtio/vhost-shadow-virtqueue.c


_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux