On Wed, Dec 12, 2018 at 05:25:50PM +0800, jiangyiwen wrote:
> Currently vsock only supports sending/receiving small packets, so it
> can't achieve high performance. As previously discussed with Jason
> Wang, I revisited the vhost-net idea of mergeable rx buffers and
> implemented mergeable rx buffers in vhost-vsock, which allows a big
> packet to be scattered into different buffers and improves
> performance noticeably.
>
> This series of patches mainly does three things:
> - implement mergeable rx buffers
> - increase the max send pkt size
> - add used descriptors and signal the guest in a batch
>
> I also wrote a tool to test vhost-vsock performance, mainly sending
> big packets (64K) both Guest->Host and Host->Guest. I tested each
> change independently, with the following results:
>
> Performance before:
>              Single socket    Multiple sockets (Max Bandwidth)
> Guest->Host  ~400MB/s         ~480MB/s
> Host->Guest  ~1450MB/s        ~1600MB/s
>
> Performance after implementing only the mergeable rx buffer:
>              Single socket    Multiple sockets (Max Bandwidth)
> Guest->Host  ~400MB/s         ~480MB/s
> Host->Guest  ~1280MB/s        ~1350MB/s
>
> In this case the max send pkt size is still limited to 4K, so
> Host->Guest performance is worse than before.

It's concerning, though: what if the application sends small packets?
What is the source of the slowdown? Do you know?

> Performance after increasing the max send pkt size to 64K:
>              Single socket    Multiple sockets (Max Bandwidth)
> Guest->Host  ~1700MB/s        ~2900MB/s
> Host->Guest  ~1500MB/s        ~2440MB/s
>
> Performance with all patches applied:
>              Single socket    Multiple sockets (Max Bandwidth)
> Guest->Host  ~1700MB/s        ~2900MB/s
> Host->Guest  ~1700MB/s        ~2900MB/s
>
> From the test results, performance improves considerably, and guest
> memory is not wasted.
>
> In addition, in order to support mergeable rx buffers in
> virtio-vsock, we need a QEMU patch to support parsing the feature.
>
> ---
> v1 -> v2:
> * Addressed comments from Jason Wang.
> * Added performance test results for each change independently.
> * Use skb_page_frag_refill(), which can use high-order pages and
>   reduce the stress on the page allocator.
> * Still use a fixed size (PAGE_SIZE) to fill rx buffers, because a
>   smaller size can't hold one full packet and we only have 128 vq
>   entries now.
> * Use an iovec to replace buf in struct virtio_vsock_pkt, keeping tx
>   and rx consistent.
> * Add a virtio_transport op to get the max pkt len, in order to stay
>   compatible with old versions.
> ---
>
> Yiwen Jiang (5):
>   VSOCK: support fill mergeable rx buffer in guest
>   VSOCK: support fill data to mergeable rx buffer in host
>   VSOCK: support receive mergeable rx buffer in guest
>   VSOCK: increase send pkt len in mergeable mode to improve performance
>   VSOCK: batch sending rx buffer to increase bandwidth
>
>  drivers/vhost/vsock.c                   | 183 ++++++++++++++++++++-----
>  include/linux/virtio_vsock.h            |  13 +-
>  include/uapi/linux/virtio_vsock.h       |   5 +
>  net/vmw_vsock/virtio_transport.c        | 229 +++++++++++++++++++++++++++-----
>  net/vmw_vsock/virtio_transport_common.c |  66 ++++++---
>  5 files changed, 411 insertions(+), 85 deletions(-)
>
> --
> 1.8.3.1
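
To make the mergeable rx buffer discussion concrete, here is roughly
how I read the guest-side refill from the changelog. This is a sketch
only: the function name is mine, and the refill policy is my guess
from the v2 notes about skb_page_frag_refill() and fixed PAGE_SIZE
buffers, not code quoted from the patches.

    #include <linux/mm.h>
    #include <linux/scatterlist.h>
    #include <linux/skbuff.h>
    #include <linux/virtio.h>

    /* Hypothetical helper: queue one PAGE_SIZE rx buffer carved out
     * of a page frag.  skb_page_frag_refill() may satisfy this from a
     * high-order page, which reduces stress on the page allocator
     * compared to one alloc_page() per buffer.
     */
    static int vsock_add_mrg_rx_buf(struct virtqueue *vq,
                                    struct page_frag *frag)
    {
            struct scatterlist sg;
            void *buf;

            if (!skb_page_frag_refill(PAGE_SIZE, frag, GFP_KERNEL))
                    return -ENOMEM;

            buf = page_address(frag->page) + frag->offset;
            get_page(frag->page);   /* buffer holds its own reference */
            frag->offset += PAGE_SIZE;

            sg_init_one(&sg, buf, PAGE_SIZE);
            return virtqueue_add_inbuf(vq, &sg, 1, buf, GFP_KERNEL);
    }

If that is the shape of it, a 128-entry vq limits the guest to 512K of
posted rx buffer with 4K pages, which is presumably why PAGE_SIZE was
chosen rather than anything smaller.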
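On the receive side I assume the first buffer of each packet tells the
guest how many rx buffers the packet spans, as virtio-net's mergeable
header does. The struct and loop below are hypothetical, written to
check my understanding, and may not match the uapi addition in patch 3.

    #include <linux/virtio.h>
    #include <linux/virtio_vsock.h>

    /* Hypothetical layout, modeled on virtio_net_hdr_mrg_rxbuf. */
    struct virtio_vsock_mrg_rxbuf_hdr {
            struct virtio_vsock_hdr hdr;  /* existing vsock pkt header */
            __le16 num_buffers;           /* rx buffers this pkt spans */
    };

    static void vsock_rx_one_pkt(struct virtqueue *vq)
    {
            struct virtio_vsock_mrg_rxbuf_hdr *mhdr;
            unsigned int len;
            u16 nbufs, i;

            /* First buffer carries the header and the buffer count. */
            mhdr = virtqueue_get_buf(vq, &len);
            if (!mhdr)
                    return;
            nbufs = le16_to_cpu(mhdr->num_buffers);

            /* Remaining buffers hold the rest of a packet that did
             * not fit in a single PAGE_SIZE buffer.
             */
            for (i = 1; i < nbufs; i++) {
                    void *buf = virtqueue_get_buf(vq, &len);

                    if (!buf)
                            break;  /* ring inconsistency; real code
                                     * must recover, not just stop */
                    /* ... append buf[0..len) to the pkt's rx iovec ... */
            }
    }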
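And for patch 5 I take "add used and signal guest in a batch" to mean
deferring both the used-ring update and the guest kick, along these
lines (again a sketch; VSOCK_DONE_BATCH is a number I made up, and the
caller would flush any remainder when it stops processing):

    #include "vhost.h"  /* drivers/vhost/vhost.h */

    #define VSOCK_DONE_BATCH 16  /* hypothetical batch size */

    /* Record one completed rx descriptor; write back used entries and
     * signal the guest once per VSOCK_DONE_BATCH packets instead of
     * once per packet.
     */
    static void vsock_add_used_batched(struct vhost_dev *dev,
                                       struct vhost_virtqueue *vq,
                                       struct vring_used_elem *heads,
                                       unsigned int *done,
                                       int head, u32 len)
    {
            heads[*done].id = cpu_to_vhost32(vq, head);
            heads[*done].len = cpu_to_vhost32(vq, len);

            if (++(*done) == VSOCK_DONE_BATCH) {
                    /* One used-ring update, at most one interrupt. */
                    vhost_add_used_and_signal_n(dev, vq, heads, *done);
                    *done = 0;
            }
    }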