On 2018/12/12 23:09, Michael S. Tsirkin wrote:
> On Wed, Dec 12, 2018 at 05:25:50PM +0800, jiangyiwen wrote:
>> Now vsock only supports sending/receiving small packets, so it can't
>> achieve high performance. As previously discussed with Jason Wang, I
>> revisited the vhost-net idea of a mergeable rx buffer and implemented
>> it in vhost-vsock. It allows a big packet to be scattered into
>> different buffers and improves performance significantly.
>>
>> This series of patches mainly does three things:
>> - mergeable buffer implementation
>> - increase the max send pkt size
>> - add used and signal guest in a batch
>>
>> I wrote a tool to test the vhost-vsock performance, mainly sending big
>> packets (64K) in both directions, Guest->Host and Host->Guest. I tested
>> each change independently, with the following results:
>>
>> Performance before:
>>               Single socket    Multiple sockets (Max Bandwidth)
>> Guest->Host   ~400MB/s         ~480MB/s
>> Host->Guest   ~1450MB/s        ~1600MB/s
>>
>> Performance after applying only the mergeable rx buffer implementation:
>>               Single socket    Multiple sockets (Max Bandwidth)
>> Guest->Host   ~400MB/s         ~480MB/s
>> Host->Guest   ~1280MB/s        ~1350MB/s
>>
>> In this case, max send pkt size is still limited to 4K, so Host->Guest
>> performance will be worse than before.
>
> It's concerning though, what if application sends small packets?
> What is the source of the slowdown? Do you know?
>

Hi Michael,

I measured the "before" numbers a month ago and did not retest them this
time, so the results may fluctuate somewhat. Today I will retest all
cases, including small and big packets, and try to find the cause of the
slowdown.

Thanks,
Yiwen.

>> Performance after increasing the max send pkt size to 64K:
>>               Single socket    Multiple sockets (Max Bandwidth)
>> Guest->Host   ~1700MB/s        ~2900MB/s
>> Host->Guest   ~1500MB/s        ~2440MB/s
>>
>> Performance after applying all patches:
>>               Single socket    Multiple sockets (Max Bandwidth)
>> Guest->Host   ~1700MB/s        ~2900MB/s
>> Host->Guest   ~1700MB/s        ~2900MB/s
>>
>> From the test results, the performance is improved significantly, and
>> guest memory will not be wasted.
>>
>> In addition, in order to support the mergeable rx buffer in
>> virtio-vsock, we need to add a qemu patch to support parsing the
>> feature.
>>
>> ---
>> v1 -> v2:
>>  * Addressed comments from Jason Wang.
>>  * Add performance test results independently.
>>  * Use skb_page_frag_refill(), which can use high order pages and reduce
>>    the stress on the page allocator.
>>  * Still use a fixed size (PAGE_SIZE) to fill the rx buffer, because too
>>    small a size can't hold one full packet; we only have a vq num of
>>    128 now.
>>  * Use iovec to replace buf in struct virtio_vsock_pkt, to keep tx and
>>    rx consistent.
>>  * Add virtio_transport ops to get max pkt len, in order to be
>>    compatible with the old version.
>> ---
>>
>> Yiwen Jiang (5):
>>   VSOCK: support fill mergeable rx buffer in guest
>>   VSOCK: support fill data to mergeable rx buffer in host
>>   VSOCK: support receive mergeable rx buffer in guest
>>   VSOCK: increase send pkt len in mergeable mode to improve performance
>>   VSOCK: batch sending rx buffer to increase bandwidth
>>
>>  drivers/vhost/vsock.c                   | 183 ++++++++++++++++++++-----
>>  include/linux/virtio_vsock.h            |  13 +-
>>  include/uapi/linux/virtio_vsock.h       |   5 +
>>  net/vmw_vsock/virtio_transport.c        | 229 +++++++++++++++++++++++++++-----
>>  net/vmw_vsock/virtio_transport_common.c |  66 ++++++---
>>  5 files changed, 411 insertions(+), 85 deletions(-)
>>
>> --
>> 1.8.3.1
>
> .
>
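
For reference, the buffer arithmetic behind the fixed-size rx buffers
mentioned in the changelog can be sketched roughly as below. This is only
an illustration, not code from the series: the helper names are
hypothetical, PAGE_SIZE is assumed to be 4 KiB, and the packet header is
ignored (only payload bytes are counted).

```c
#include <stddef.h>

/* Assumed values: rx buffers are filled with PAGE_SIZE (taken as 4 KiB
 * here) and the vq num is 128, as stated in the v1 -> v2 changelog. */
#define RX_BUF_SIZE 4096u
#define VQ_NUM      128u

/* Hypothetical helper: number of fixed-size rx buffers that a payload of
 * `len` bytes is scattered over (ceiling division, header not counted). */
static size_t rx_bufs_needed(size_t len, size_t buf_size)
{
    if (len == 0)
        return 1; /* assumption: an empty payload still occupies one buffer */
    return (len + buf_size - 1) / buf_size;
}

/* Hypothetical helper: upper bound on the payload bytes that can be
 * posted in the rx vq at one time. */
static size_t rx_ring_capacity(size_t vq_num, size_t buf_size)
{
    return vq_num * buf_size;
}
```

Under these assumptions, a 64K payload is spread over 16 buffers, and a
fully posted vq of 128 entries can hold at most 512K of payload at once.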