Re: [PATCH vhost] vhost_net: Fix too many vring kick on busypoll

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2018年07月02日 10:52, Toshiaki Makita wrote:
On 2018/07/02 11:41, Jason Wang wrote:
On 2018年06月30日 00:38, Michael S. Tsirkin wrote:
On Fri, Jun 29, 2018 at 05:09:50PM +0900, Toshiaki Makita wrote:
Under heavy load vhost busypoll may run without suppressing
notification. For example tx zerocopy callback can push tx work while
handle_tx() is running, then busyloop exits due to vhost_has_work()
condition and enables notification but immediately reenters handle_tx()
because the pushed work was tx. In this case handle_tx() tries to
disable notification again, but when using event_idx it by design
cannot. Then busyloop will run without suppressing notification.
Another example is the case where handle_tx() tries to enable
notification but avail idx is advanced so disables it again. This case
also lead to the same situation with event_idx.

The problem is that once we enter this situation busyloop does not work
under heavy load for considerable amount of time, because notification
is likely to happen during busyloop and handle_tx() immediately enables
notification after notification happens. Specifically busyloop detects
notification by vhost_has_work() and then handle_tx() calls
vhost_enable_notify().
I'd like to understand the problem a bit better.
Why does this happen?
Doesn't this only happen if ring is empty?

My understanding is:

vhost_zerocopy_callback() try to poll vhost virtqueue. This will cause
the busy loop in vhost_net_tx_get_vq_desc() to exit because of
vhost_has_work() return true. Then handle_tx() tends to enable
notification. Then guest may kick us even if handle_tx() call
vhost_disable_notify() which in fact did nothing for even index.
Yes.

Maybe we can try to call vhost_zerocopy_signal_used() if we found
there's pending used from zerocopy instead.
Note that even when zerocopy is disabled the problem happens as I wrote.
When vhost_enable_notify() detects avail_idx advanced it tries to
disable notification again but it fails.


Yes, and the main reason is need_resched() and rx work. (polling RX will be addressed by Tonghao's patch I think).

Thanks



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux