Re: [PATCH vhost] vhost_net: Fix too many vring kick on busypoll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2018年07月02日 16:05, Toshiaki Makita wrote:
On 2018/07/02 16:52, Jason Wang wrote:
On 2018年07月02日 15:11, Toshiaki Makita wrote:
On 2018/07/02 15:17, Jason Wang wrote:
On 2018年07月02日 12:37, Toshiaki Makita wrote:
On 2018/07/02 11:54, Jason Wang wrote:
On 2018年07月02日 10:45, Toshiaki Makita wrote:
Hi Jason,

On 2018/06/29 18:30, Jason Wang wrote:
On 2018年06月29日 16:09, Toshiaki Makita wrote:
...
To fix this, poll the work instead of enabling notification when
busypoll is interrupted by something. IMHO signal_pending() and
vhost_has_work() are kind of interruptions rather than signals to
completely cancel the busypoll, so let's run busypoll after the
necessary work is done. In order to avoid too long busyloop due to
interruption, save the endtime in vq field and use it when
reentering
the work function.
I think we don't care long busyloop unless e.g tx can starve rx?
I just want to keep it user-controllable. Unless memorizing it
busypoll
can run unexpectedly long.
I think the total amount of time for busy polling is bounded. If I was
wrong, it should be a bug somewhere.
Consider this kind of scenario:
0. Set 100us busypoll for example.
1. handle_tx() runs busypoll.
2. Something like zerocopy queues tx_work within 100us.
3. busypoll exits and call handle_tx() again.
4. Repeat 1-3.

In this case handle_tx() does not process packets but busypoll
essentially runs beyond 100us without endtime memorized. This may be
just a theoretical problem, but I was worried that more code to poll tx
queue can be added in the future and it becomes realistic.
Yes, but consider zerocopy tends to batch 16 used packets and we will
finally finish all processing of packets. The above won't be endless, so
it was probably tolerable.
Right. So endtime memorization is more like a future-proof thing.
Would you like to keep it or change something?
I think we'd better introduce it only when we meet real bugs.
I'll change it to a flag to indicate the previous busypoll is interrupted.

Performance numbers:

- Bulk transfer from guest to external physical server.
         [Guest]->vhost_net->tap--(XDP_REDIRECT)-->i40e --(wire)-->
[Server]
Just to confirm in this case since zerocopy is enabled, we are in
fact
use the generic XDP datapath?
For some reason zerocopy was not applied for most packets, so in most
cases driver XDP was used. I was going to dig into it but not yet.
Right, just to confirm this. This is expected.

In tuntap, we do native XDP only for small and non zerocopy
packets. See
tun_can_build_skb(). The reason is XDP may adjust packet header
which is
not supported by zercopy. We can only use XDP generic for zerocopy in
this case.
I think I understand when driver XDP can be used. What I'm not sure and
was going to narrow down is why zerocopy is mostly not applied.

I see, any touch to the zerocopy packet (clone, header expansion or
segmentation) that lead a userspace copy will increase the error counter
in vhost_net. Then vhost_net_tx_select_zcopy() may choose not to use
zerocopy. So it was probably something in your setup or a bug somewhere.
Thanks for the hint!
Seems zerocopy packets are always nonlinear and
netif_receive_generic_xdp() calls skb_linearize() in which
__pskb_pull_tail() calls skb_zcopy_clear(). Looks like tx_zcopy_err is
always counted when zerocopy is used with XDP in my env.


Right, since it needs do copy there, so we tend to disable zerocopy in this case.

Thanks




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux