On Wed, Jun 27, 2018 at 10:24:43PM +0800, Jason Wang wrote:
>
>
> On 2018-06-26 13:17, xiangxia.m.yue@xxxxxxxxx wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@xxxxxxxxx>
> >
> > This patch improves guest receive performance from the
> > host. On the handle_tx side, we poll the sock receive
> > queue at the same time. handle_rx does the same.
> >
> > To avoid deadlock, change the code to lock the vqs one
> > by one and use VHOST_NET_VQ_XX as the subclass for
> > mutex_lock_nested. With the patch, qemu can set a different
> > busyloop_timeout for the rx and tx queues.
> >
> > We set poll-us=100us and use iperf3 to test
> > throughput. The iperf3 commands are shown below.
> >
> > on the guest:
> > iperf3 -s -D
> >
> > on the host:
> > iperf3 -c 192.168.1.100 -i 1 -P 10 -t 10 -M 1400
> >
> > * With the patch: 23.1 Gbits/sec
> > * Without the patch: 12.7 Gbits/sec
> >
> > Signed-off-by: Tonghao Zhang <zhangtonghao@xxxxxxxxxxxxxxx>
>
> Thanks a lot for the patch. Looks good generally, but please split this big
> patch into separate ones like:
>
> patch 1: lock vqs one by one
> patch 2: replace magic number of lock annotation
> patch 3: factor out generic busy polling logic to vhost_net_busy_poll()
> patch 4: add rx busy polling in tx path.
>
> And please cc Michael in v3.
>
> Thanks

Please include host CPU utilization numbers. You can get them e.g. using
vmstat. I suspect we also want the polling to be controllable, e.g. through
an ioctl.

-- 
MST