[PATCH V4 RESEND 00/22] Multiqueue virtio-net

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello all:

This seires is an update of last version of multiqueue virtio-net support.

This series tries to brings multiqueue support to virtio-net through a
multiqueue support tap backend and multiple vhost threads.

Patch 1 converts bitfield in TAPState to bool. Patch 2 replace assert(0) with
abort() in tap.

To support this, multiqueue nic support were added to qemu. This is done by
introducing an array of NetClientStates in NICState, and make each pair of peers
to be an queue of the nic. This is done in patch 3-9.

Tap were also converted to be able to create a multiple queue
backend. Currently, only linux support this by issuing TUNSETIFF N times with
the same device name to create N queues. Each fd returned by TUNSETIFF were a
queue supported by kernel. Three new command lines were introduced, "queues"
were used to tell how many queues will be created by qemu; "fds" were used to
pass multiple pre-created tap file descriptors to qemu; "vhostfds" were used to
pass multiple pre-created vhost descriptors to qemu. This is done in patch 10-15.

A method of deleting a queue and queue_index were also introduce for virtio,
this is done in patch 16-17.

Vhost were also changed to support multiqueue by introducing a start vq index
which tracks the first virtqueue that will be used by vhost instead of the
assumption that the vhost always use virtqueue from index 0. This is done in
patch 18.

The last part is the multiqueue userspace changes, this is done in patch 19-22.

With this changes, user could start a multiqueue virtio-net device through

./qemu -netdev tap,id=hn0,queues=2,vhost=on -device virtio-net-pci,netdev=hn0

Management tools such as libvirt can pass multiple pre-created fds/vhostfds through

./qemu -netdev tap,id=hn0,fds=X:Y,vhostfds=M:N -device virtio-net-pci,netdev=hn0

For the one who wants to try, a git tree is available at:
git://github.com/jasowang/qemu.git

Changes from V4:
- fix the conflict with Michael's tree and rebase to the latest (Michael)

Changes from V3:
- convert bitfield to bool in TAPState (Blue)
- use abort() instead of assert(0) in tap code (Blue)
- rebase to the latest
- fix a bug that breaks the non-tap network

Changes from V2:
- Don't start/stop vhost threads when changing queues and simplify the interface
  between virtio-net and vhost further.

Changes from V1:
- silent checkpatch (Blue)
- use fds/vhostfds instead of fd/vhostfd (Stefan)
- use fds="X:Y:Z" instead of fd=X,fd=Y,fd=Z (Anthony)
- split patches (Stefan)
- typos in commit log (Stefan)
- Warn 'queues=' when fds/vhostfds is used (Stefan)
- rename __net_init_tap to net_init_tap_one (Stefan)
- check the consistency of vnet_hdr of multiple tap fds (Stefan)
- disable multiqueue support for bridge-helper (Stefan)
- rename tap_attach()/tap_detach() to tap_enable()/tap_disable() (Stefan)
- fix booting with legacy guest (WanLong)
- don't bump the version when doing migration (Michael)
- simplify the interface between virtio-net and multiqueue vhost_net (Michael)
- rebase the patches to latest
- re-order the patches that let the net part comes first to simplify the
  reviewing
- simplify the interface between virtio-net and multiqueue vhost_net
- move the guest notifiers setup from vhost to vhost_net
- fix a build issue of hw/mcf_fce.c

Changes from RFC v2:
- rebase the codes to latest qemu
- align the multiqueue virtio-net implementation to virtio spec
- split the patches into more smaller patches
- set_link and hotplug support

Changes from RFC V1:
- rebase to the latest
- fix memory leak in parse_netdev
- fix guest notifiers assignment/de-assignment
- changes the command lines to:
   qemu -netdev tap,queues=2 -device virtio-net-pci,queues=2

Reference:
V1: http://lists.nongnu.org/archive/html/qemu-devel/2012-12/msg03558.html
RFC v2: http://lists.gnu.org/archive/html/qemu-devel/2012-06/msg04108.html
RFC v1: http://comments.gmane.org/gmane.comp.emulators.qemu/100481

Perf Numbers:
- norm is short for normalize result
- trans.rate is short for transaction rate

Two Intel Xeon 5620 with direct connected intel 82599EB
Host/Guest kernel: David net tree
vhost enabled

- lots of improvents of both latency and cpu utilization in request-reponse test
- get regression of guest sending small packets which because TCP tends to batch
  less when the latency were improved

1q/2q/4q
TCP_RR
 size #sessions trans.rate  norm trans.rate  norm trans.rate  norm
1 1     9393.26   595.64  9408.18   597.34  9375.19   584.12
1 20    72162.1   2214.24 129880.22 2456.13 196949.81 2298.13
1 50    107513.38 2653.99 139721.93 2490.58 259713.82 2873.57
1 100   126734.63 2676.54 145553.5  2406.63 265252.68 2943
64 1    9453.42   632.33  9371.37   616.13  9338.19   615.97
64 20   70620.03  2093.68 125155.75 2409.15 191239.91 2253.32
64 50   106966    2448.29 146518.67 2514.47 242134.07 2720.91
64 100  117046.35 2394.56 190153.09 2696.82 238881.29 2704.41
256 1   8733.29   736.36  8701.07   680.83  8608.92   530.1
256 20  69279.89  2274.45 115103.07 2299.76 144555.16 1963.53
256 50  97676.02  2296.09 150719.57 2522.92 254510.5  3028.44
256 100 150221.55 2949.56 197569.3  2790.92 300695.78 3494.83
TCP_CRR
 size #sessions trans.rate  norm trans.rate  norm trans.rate  norm
1 1     2848.37  163.41 2230.39  130.89 2013.09  120.47
1 20    23434.5  562.11 31057.43 531.07 49488.28 564.41
1 50    28514.88 582.17 40494.23 605.92 60113.35 654.97
1 100   28827.22 584.73 48813.25 661.6  61783.62 676.56
64 1    2780.08  159.4  2201.07  127.96 2006.8   117.63
64 20   23318.51 564.47 30982.44 530.24 49734.95 566.13
64 50   28585.72 582.54 40576.7  610.08 60167.89 656.56
64 100  28747.37 584.17 49081.87 667.87 60612.94 662
256 1   2772.08  160.51 2231.84  131.05 2003.62  113.45
256 20  23086.35 559.8  30929.09 528.16 48454.9  555.22
256 50  28354.7  579.85 40578.31 607    60261.71 657.87
256 100 28844.55 585.67 48541.86 659.08 61941.07 676.72
TCP_STREAM guest receiving
 size #sessions throughput  norm throughput  norm throughput  norm
1 1     16.27   1.33   16.1    1.12   16.13   0.99
1 2     33.04   2.08   32.96   2.19   32.75   1.98
1 4     66.62   6.83   68.3    5.56   66.14   2.65
64 1    896.55  56.67  914.02  58.14  898.9   61.56
64 2    1830.46 91.02  1812.02 64.59  1835.57 66.26
64 4    3626.61 142.55 3636.25 100.64 3607.46 75.03
256 1   2619.49 131.23 2543.19 129.03 2618.69 132.39
256 2   5136.58 203.02 5163.31 141.11 5236.51 149.4
256 4   7063.99 242.83 9365.4  208.49 9421.03 159.94
512 1   3592.43 165.24 3603.12 167.19 3552.5  169.57
512 2   7042.62 246.59 7068.46 180.87 7258.52 186.3
512 4   6996.08 241.49 9298.34 206.12 9418.52 159.33
1024 1  4339.54 192.95 4370.2  191.92 4211.72 192.49
1024 2  7439.45 254.77 9403.99 215.24 9120.82 222.67
1024 4  7953.86 272.11 9403.87 208.23 9366.98 159.49
4096 1  7696.28 272.04 7611.41 270.38 7778.71 267.76
4096 2  7530.35 261.1  8905.43 246.27 8990.18 267.57
4096 4  7121.6  247.02 9411.75 206.71 9654.96 184.67
16384 1 7795.73 268.54 7780.94 267.2  7634.26 260.73
16384 2 7436.57 255.81 9381.86 220.85 9392    220.36
16384 4 7199.07 247.81 9420.96 205.87 9373.69 159.57
TCP_MAERTS guest sending
 size #sessions throughput  norm throughput  norm throughput  norm
1 1     15.94   0.62   15.55   0.61   15.13   0.59
1 2     36.11   0.83   32.46   0.69   32.28   0.69
1 4     71.59   1      68.91   0.94   61.52   0.77
64 1    630.71  22.52  622.11  22.35  605.09  21.84
64 2    1442.36 30.57  1292.15 25.82  1282.67 25.55
64 4    3186.79 42.59  2844.96 36.03  2529.69 30.06
256 1   1760.96 58.07  1738.44 57.43  1695.99 56.19
256 2   4834.23 95.19  3524.85 64.21  3511.94 64.45
256 4   9324.63 145.74 8956.49 116.39 6720.17 73.86
512 1   2678.03 84.1   2630.68 82.93  2636.54 82.57
512 2   9368.17 195.61 9408.82 204.53 5316.3  92.99
512 4   9186.34 209.68 9358.72 183.82 9489.29 160.42
1024 1  3620.71 109.88 3625.54 109.83 3606.61 112.35
1024 2  9429    258.32 7082.79 120.55 7403.53 134.78
1024 4  9430.66 290.44 9499.29 232.31 9414.6  190.92
4096 1  9339.28 296.48 9374.23 372.88 9348.76 298.49
4096 2  9410.53 378.69 9412.61 286.18 9409.75 278.31
4096 4  9487.35 374.1  9556.91 288.81 9441.94 221.64
16384 1 9380.43 403.8  9379.78 399.13 9382.42 393.55
16384 2 9367.69 406.93 9415.04 312.68 9409.29 300.9
16384 4 9391.96 405.17 9695.12 310.54 9423.76 223.47

Jason Wang (22):
  net: tap: using bool instead of bitfield
  net: tap: use abort() instead of assert(0)
  net: introduce qemu_get_queue()
  net: introduce qemu_get_nic()
  net: intorduce qemu_del_nic()
  net: introduce qemu_find_net_clients_except()
  net: introduce qemu_net_client_setup()
  net: introduce NetClientState destructor
  net: multiqueue support
  tap: import linux multiqueue constants
  tap: factor out common tap initialization
  tap: add Linux multiqueue support
  tap: support enabling or disabling a queue
  tap: introduce a helper to get the name of an interface
  tap: multiqueue support
  vhost: multiqueue support
  virtio: introduce virtio_del_queue()
  virtio: add a queue_index to VirtQueue
  virtio-net: separate virtqueue from VirtIONet
  virtio-net: multiqueue support
  virtio-net: migration support for multiqueue
  virtio-net: compat multiqueue support

 hw/cadence_gem.c            |   17 +-
 hw/dp8393x.c                |   17 +-
 hw/e1000.c                  |   32 ++--
 hw/eepro100.c               |   18 +-
 hw/etraxfs_eth.c            |   11 +-
 hw/lan9118.c                |   16 +-
 hw/lance.c                  |    2 +-
 hw/mcf_fec.c                |   12 +-
 hw/milkymist-minimac2.c     |   10 +-
 hw/mipsnet.c                |   10 +-
 hw/musicpal.c               |    6 +-
 hw/ne2000-isa.c             |    4 +-
 hw/ne2000.c                 |   13 +-
 hw/opencores_eth.c          |   12 +-
 hw/pc_piix.c                |    4 +
 hw/pcnet-pci.c              |    4 +-
 hw/pcnet.c                  |   13 +-
 hw/qdev-properties-system.c |   46 ++++-
 hw/qdev-properties.h        |    6 +-
 hw/rtl8139.c                |   22 +-
 hw/smc91c111.c              |   10 +-
 hw/spapr_llan.c             |    8 +-
 hw/stellaris_enet.c         |   11 +-
 hw/usb/dev-network.c        |   16 +-
 hw/vhost.c                  |   81 +++----
 hw/vhost.h                  |    2 +
 hw/vhost_net.c              |   86 +++++++-
 hw/vhost_net.h              |    4 +-
 hw/virtio-net.c             |  507 +++++++++++++++++++++++++++++++------------
 hw/virtio-net.h             |   27 +++-
 hw/virtio.c                 |   17 ++
 hw/virtio.h                 |    3 +
 hw/xen_nic.c                |   17 +-
 hw/xgmac.c                  |   10 +-
 hw/xilinx_axienet.c         |   10 +-
 hw/xilinx_ethlite.c         |   12 +-
 include/net/net.h           |   26 ++-
 include/net/tap.h           |    6 +-
 net/net.c                   |  198 ++++++++++++++----
 net/tap-aix.c               |   18 ++-
 net/tap-bsd.c               |   18 ++-
 net/tap-haiku.c             |   18 ++-
 net/tap-linux.c             |   71 ++++++-
 net/tap-linux.h             |    4 +
 net/tap-solaris.c           |   18 ++-
 net/tap-win32.c             |   18 ++-
 net/tap.c                   |  333 ++++++++++++++++++++--------
 net/tap_int.h               |    6 +-
 qapi-schema.json            |    5 +-
 savevm.c                    |    2 +-
 50 files changed, 1325 insertions(+), 512 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux