>On Tue, Apr 03, 2018 at 12:29:47PM +0000, haibinzhang wrote:
>> >On Tue, Apr 03, 2018 at 08:08:26AM +0000, haibinzhang wrote:
>> >> handle_tx() will delay rx for a long time when tx is busy polling udp
>> >> packets with a small payload (e.g. a 1-byte udp payload), because the
>> >> VHOST_NET_WEIGHT limit takes into account only the bytes sent, not the
>> >> number of packets.
>> >>
>> >> Tests were done between two virtual machines using netperf (UDP_STREAM,
>> >> len=1), while another machine pinged the client. The results are as
>> >> follows:
>> >>
>> >> Packet#   Ping-Latency(ms)
>> >>             min     avg     max
>> >> Origin    3.319  18.489  57.503
>> >>   64      1.643   2.021   2.552
>> >>  128      1.825   2.600   3.224
>> >>  256      1.997   2.710   4.295
>> >>  512*     1.860   3.171   4.631
>> >> 1024      2.002   4.173   9.056
>> >> 2048      2.257   5.650   9.688
>> >> 4096      2.093   8.508  15.943
>> >>
>> >> 512 is selected, which is a multiple of VRING_SIZE
>> >
>> >There's no guarantee vring size is 256.
>> >
>> >Could you pls try with a different tx ring size?
>> >
>> >I suspect we want:
>> >
>> >#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)
>> >
>> >> and close to VHOST_NET_WEIGHT/MTU.
>> >
>> >Puzzled by this part. Does tweaking the MTU change anything?
>>
>> The Ethernet MTU is 1500, so VHOST_NET_WEIGHT/MTU equals 0x80000/1500 ~= 350.
>
>We should include the 12-byte header, so it's a bit lower.
>
>> Then the bytes sent cannot reach VHOST_NET_WEIGHT in one handle_tx() even
>> with 1500-byte frames if the packet count is less than 350. So the packet
>> limit must be bigger than 350. 512 meets this condition.
>
>What you seem to say is this:
>
>	Imagine MTU-sized buffers. With these we stop after 350
>	packets. Thus adding another limit > 350 will not
>	slow us down.
>
>	Fair enough, but that won't apply with smaller packet
>	sizes, will it?

Right.

>I still think a simpler argument carries more weight:
>
>Ring size is a hint from the device about a burst size it can tolerate.
>Based on benchmarks, we tweak the limit to 2 * vq size as that seems to
>perform a bit better, and is still safer than no limit on the number of
>packets, as is done now.
>
>But this needs testing with another ring size.
>Could you try that please?

Got it. We will test with another ring size and submit again.

>> and is also DEFAULT VRING_SIZE aligned.
>
>Neither Linux nor virtio has a default vring size. It's a historical
>construct that exists in qemu for qemu compatibility reasons.
>
>> >> To evaluate this change, further tests were done using netperf (RR, TX)
>> >> between two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz.
>> >> The results below do not show obvious changes:
>> >>
>> >> TCP_RR
>> >>
>> >> size/sessions/+thu%/+normalize%
>> >>    1/ 1/  -7%/  -2%
>> >>    1/ 4/  +1%/   0%
>> >>    1/ 8/  +1%/  -2%
>> >>   64/ 1/  -6%/   0%
>> >>   64/ 4/   0%/  +2%
>> >>   64/ 8/   0%/   0%
>> >>  256/ 1/  -3%/  -4%
>> >>  256/ 4/  +3%/  +4%
>> >>  256/ 8/  +2%/   0%
>> >>
>> >> UDP_RR
>> >>
>> >> size/sessions/+thu%/+normalize%
>> >>    1/ 1/  -5%/  +1%
>> >>    1/ 4/  +4%/  +1%
>> >>    1/ 8/  -1%/  -1%
>> >>   64/ 1/  -2%/  -3%
>> >>   64/ 4/  -5%/  -1%
>> >>   64/ 8/   0%/  -1%
>> >>  256/ 1/  +7%/  +1%
>> >>  256/ 4/  +1%/  +1%
>> >>  256/ 8/  +2%/  +2%
>> >>
>> >> TCP_STREAM
>> >>
>> >> size/sessions/+thu%/+normalize%
>> >>   64/ 1/   0%/  -3%
>> >>   64/ 4/  +3%/  -1%
>> >>   64/ 8/  +9%/  -4%
>> >>  256/ 1/  +1%/  -4%
>> >>  256/ 4/  -1%/  -1%
>> >>  256/ 8/  +7%/  +5%
>> >>  512/ 1/  +1%/   0%
>> >>  512/ 4/  +1%/  -1%
>> >>  512/ 8/  +7%/  -5%
>> >> 1024/ 1/   0%/  -1%
>> >> 1024/ 4/  +3%/   0%
>> >> 1024/ 8/  +8%/  +5%
>> >> 2048/ 1/  +2%/  +2%
>> >> 2048/ 4/  +1%/   0%
>> >> 2048/ 8/  -2%/   0%
>> >> 4096/ 1/  -2%/   0%
>> >> 4096/ 4/  +2%/   0%
>> >> 4096/ 8/  +9%/  -2%
>> >>
>> >> Signed-off-by: Haibin Zhang <haibinzhang@xxxxxxxxxxx>
>> >> Signed-off-by: Yunfang Tai <yunfangtai@xxxxxxxxxxx>
>> >> Signed-off-by: Lidong Chen <lidongchen@xxxxxxxxxxx>
>> >> ---
>> >>  drivers/vhost/net.c | 8 +++++++-
>> >>  1 file changed, 7 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> >> index 8139bc70ad7d..13a23f3f3ea4 100644
>> >> --- a/drivers/vhost/net.c
>> >> +++ b/drivers/vhost/net.c
>> >> @@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>> >>   * Using this limit prevents one virtqueue from starving others. */
>> >>  #define VHOST_NET_WEIGHT 0x80000
>> >>
>> >> +/* Max number of packets transferred before requeueing the job.
>> >> + * Using this limit prevents one virtqueue from starving rx. */
>> >> +#define VHOST_NET_PKT_WEIGHT 512
>> >> +
>> >>  /* MAX number of TX used buffers for outstanding zerocopy */
>> >>  #define VHOST_MAX_PEND 128
>> >>  #define VHOST_GOODCOPY_LEN 256
>> >> @@ -473,6 +477,7 @@ static void handle_tx(struct vhost_net *net)
>> >>  	struct socket *sock;
>> >>  	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
>> >>  	bool zcopy, zcopy_used;
>> >> +	int sent_pkts = 0;
>> >>
>> >>  	mutex_lock(&vq->mutex);
>> >>  	sock = vq->private_data;
>> >> @@ -580,7 +585,8 @@ static void handle_tx(struct vhost_net *net)
>> >>  		else
>> >>  			vhost_zerocopy_signal_used(net, vq);
>> >>  		vhost_net_tx_packet(net);
>> >> -		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
>> >> +		if (unlikely(total_len >= VHOST_NET_WEIGHT) ||
>> >> +		    unlikely(++sent_pkts >= VHOST_NET_PKT_WEIGHT)) {
>> >>  			vhost_poll_queue(&vq->poll);
>> >>  			break;
>> >>  		}
>> >> --
>> >> 2.12.3
>> >>
>>
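To make the suggestion above concrete, here is a minimal sketch (untested, not
the posted patch) of how a vq-size-based weight could slot into the same exit
check at the bottom of the handle_tx() loop. The factor of 2 is taken from the
suggestion earlier in the thread; everything else reuses the identifiers
already present in the diff above:

/* Max number of packets transferred before requeueing the job.
 * Scaling by the ring size ties the packet budget to the burst size
 * the device advertised instead of a fixed 512. */
#define VHOST_NET_PKT_WEIGHT(vq) ((vq)->num * 2)

		/* At the bottom of the handle_tx() for(;;) loop: stop once
		 * either the byte budget or the packet budget is spent and
		 * requeue the work so rx and the other virtqueues get a turn. */
		if (unlikely(total_len >= VHOST_NET_WEIGHT) ||
		    unlikely(++sent_pkts >= VHOST_NET_PKT_WEIGHT(vq))) {
			vhost_poll_queue(&vq->poll);
			break;
		}

With that form, changing the tx ring size automatically moves the packet
limit, so the earlier "multiple of VRING_SIZE" argument no longer depends on
the ring having 256 entries.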