On 9/8/2015 11:54 PM, Xie, Huawei wrote:
> On 9/8/2015 11:39 PM, Stephen Hemminger wrote:
>> On Fri, 4 Sep 2015 08:25:05 +0000
>> "Xie, Huawei" <huawei.xie@xxxxxxxxx> wrote:
>>
>>> Hi:
>>>
>>> Recently I have done a virtio optimization proof of concept. The
>>> optimization includes two parts:
>>> 1) avail ring set with fixed descriptors
>>> 2) RX vectorization
>>> With the optimizations, we could get a several-fold performance boost
>>> for pure vhost-virtio throughput.
>>>
>>> Here I will only cover the first part, which is the prerequisite for the
>>> second part.
>>> Let us first take RX as an example. Currently, when we fill the avail ring
>>> with guest mbufs, we need to
>>> a) allocate one descriptor (for a non-sg mbuf) from the free descriptors
>>> b) set the idx of the desc into the entry of the avail ring
>>> c) set the addr/len fields of the descriptor to point to the guest's blank
>>> mbuf data area
>>>
>>> Those operations take time, and step b in particular leaves the cache
>>> line for the avail ring in the modified (M) state in the virtio
>>> processing core. When vhost processes the avail ring, the cache line
>>> transfer from the virtio processing core to the vhost processing core
>>> costs a significant number of CPU cycles.
>>> To solve this problem, this is the arrangement of the RX ring for the DPDK
>>> PMD (for the non-mergeable case).
>>>
>>>                avail
>>>                 idx
>>>                  +
>>>                  |
>>> +----+----+---+-------------+------+
>>> | 0  | 1  | 2 | ...  |  254 | 255  |  avail ring
>>> +-+--+-+--+-+-+---------+---+--+---+
>>>   |    |    |    |      |      |
>>>   |    |    |    |      |      |
>>>   v    v    v    |      v      v
>>> +-+--+-+--+-+-+---------+---+--+---+
>>> | 0  | 1  | 2 | ...  |  254 | 255  |  desc ring
>>> +----+----+---+-------------+------+
>>>                  |
>>>                  |
>>> +----+----+---+-------------+------+
>>> | 0  | 1  | 2 |      |  254 | 255  |  used ring
>>> +----+----+---+-------------+------+
>>>                  |
>>>                  +
>>> The avail ring is initialized with fixed descriptors and is never changed,
>>> i.e., the index value of the nth avail ring entry is always n, which
>>> means the virtio PMD is actually refilling the desc ring only, without
>>> having to change the avail ring.
>>> When vhost fetches the avail ring, if not evicted, it is always in its
>>> first-level cache.
>>>
>>> When RX receives packets from the used ring, we use the used->idx as the
>>> desc idx. This requires that vhost processes and returns descs from the
>>> avail ring to the used ring in order, which is true for both the current
>>> dpdk vhost and kernel vhost implementations. In my understanding, there is
>>> no necessity for vhost net to process descriptors out of order (OOO). One
>>> case could be zero copy: for example, if one descriptor doesn't meet the
>>> zero copy requirement, we could directly return it to the used ring,
>>> earlier than the descriptors in front of it.
>>> To enforce this, I want to use a reserved bit to indicate in-order
>>> processing of descriptors.
>>>
>>> For the tx ring, the arrangement is like below. Each transmitted mbuf needs
>>> a desc for virtio_net_hdr, so actually we have only 128 free slots.
>>>
>>>
>>>                          ++
>>>                          ||
>>>                          ||
>>> +-----+-----+-----+--------------+------+------+------+
>>> |  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
>>> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>>>    |     |            |  ||  |      |             |
>>>    v     v            v  ||  v      v             v
>>> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>>> | 127 | 128 | ... |  255 || 127  | 128  | ...  | 255  |   desc ring for virtio_net_hdr
>>> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>>>    |     |            |  ||  |      |             |
>>>    v     v            v  ||  v      v             v
>>> +--+--+--+--+-----+---+------+---+--+---+------+--+---+
>>> |  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx dat
>>>
>>>
>> Does this still work with Linux (or BSD) guest/host.
>> If you are assuming both virtio/vhost are DPDK this is never going
>> to be usable.
> It works with both dpdk vhost and kernel vhost implementations.
> But to enforce this, we had better add a new feature bit.

Hi Stephen, an update about compatibility:
This optimization is in theory compliant with the current kernel vhost,
qemu, and dpdk vhost implementations.
Today I ran the dpdk virtio PMD with qemu and kernel vhost, and it works fine.

>> On a related note, have you looked at getting virtio to support the
>> new standard (not legacy) mode?
> Yes, we have added virtio 1.0 support to our plan.
>>
>
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
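
For readers following the thread, the RX arrangement described in the quoted proposal (avail ring entry n always names descriptor n, refill touches only the descriptor, and the receive side derives the descriptor index from its position in the used ring) might be sketched roughly as below. This is only an illustration under stated assumptions, not the DPDK PMD code: the struct layouts are simplified stand-ins for the real vring structures, every name (rx_queue, rxq_init, rxq_refill, backend_complete_one, rxq_receive) is hypothetical, the backend is a toy that completes descriptors strictly in order, and memory barriers and notifications are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 256u  /* power of two, matching the 0..255 diagrams */

/* Simplified, hypothetical stand-ins for the split-ring structures. */
struct desc      { uint64_t addr; uint32_t len; };
struct avail     { uint16_t idx; uint16_t ring[RING_SIZE]; };
struct used_elem { uint32_t id; uint32_t len; };
struct used      { uint16_t idx; struct used_elem ring[RING_SIZE]; };

struct rx_queue {
    struct desc  desc[RING_SIZE];
    struct avail avail;
    struct used  used;
    uint16_t     last_used;  /* guest-side consumer position */
};

/* One-time setup: avail entry n always names descriptor n. */
void rxq_init(struct rx_queue *q)
{
    for (uint16_t i = 0; i < RING_SIZE; i++)
        q->avail.ring[i] = i;
    q->avail.idx = 0;
    q->used.idx = 0;
    q->last_used = 0;
}

/* Refill: write the descriptor and bump avail idx; avail.ring[] is untouched. */
void rxq_refill(struct rx_queue *q, uint64_t buf_addr, uint32_t buf_len)
{
    uint16_t slot = (uint16_t)(q->avail.idx & (RING_SIZE - 1));

    q->desc[slot].addr = buf_addr;
    q->desc[slot].len  = buf_len;
    q->avail.idx++;  /* a real driver would need a write barrier before this */
}

/* Toy backend (assumption): completes the oldest descriptor, strictly in order. */
void backend_complete_one(struct rx_queue *q, uint32_t pkt_len)
{
    uint16_t slot = (uint16_t)(q->used.idx & (RING_SIZE - 1));

    q->used.ring[slot].id  = q->avail.ring[slot];
    q->used.ring[slot].len = pkt_len;
    q->used.idx++;
}

/* Receive: the position in the used ring doubles as the descriptor index.
 * Returns 1 and fills addr/len if a packet was available, 0 otherwise. */
int rxq_receive(struct rx_queue *q, uint64_t *addr, uint32_t *len)
{
    if (q->last_used == q->used.idx)
        return 0;

    uint16_t slot = (uint16_t)(q->last_used & (RING_SIZE - 1));

    /* Holds only if the backend returns descriptors in order. */
    assert(q->used.ring[slot].id == slot);

    *addr = q->desc[slot].addr;
    *len  = q->used.ring[slot].len;
    q->last_used++;
    return 1;
}
```

The assert in rxq_receive is where the in-order requirement shows up: if a backend returned descriptors out of order, the used ring position would no longer identify the descriptor, which is why the proposal wants a feature bit to make the ordering explicit.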