Virtio BoF minutes KVM Forum 2017
Attendees: Amnon Ilan, Maxime Coquelin, Vlad Yasevich, Malcolm Crossley,
David Vrabel, Ilya Lesokhin, Cunming Liang, Jens Freimann
Topics: packed ring layout with respect to hardware implementations
References:
https://lists.oasis-open.org/archives/virtio-dev/201702/msg00010.html
https://lists.oasis-open.org/archives/virtio-dev/201709/msg00013.html
Malcolm Crossley, David Vrabel:
- keep in mind not to only optimize for network with small frame sizes.
Storage has much larger sizes
- is there really no cache line ping-pong, given that we are overwriting the
same cache line? 4 descs in one line; once we access two at the same time it
will cause cache coherency messages, no?
- interesting quirk: we flip a bit, but Intel doesn't support writing
single bytes, it will always be a full dword. will that be a problem?
- interesting to look into NVMe protocols, they seem to solve some of the same
problems hardware-wise
- VMware vmxnet3 has a separate data ring for when they have bigger amounts of
data. not to copy, but still interesting
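
To make the cache line question above concrete, here is a minimal sketch in C.
It assumes 64-byte cache lines, a cache-line-aligned ring and a 16-byte
descriptor. The struct and its field names are illustrative assumptions, not
taken from the proposal text; only "16 bytes" and "4 descs in one line" come
from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u /* assumption: 64-byte cache lines */

/* Hypothetical 16-byte descriptor; field names are illustrative only. */
struct desc {
    uint64_t addr;  /* buffer address */
    uint32_t len;   /* buffer length in bytes */
    uint16_t id;    /* buffer id */
    uint16_t flags; /* includes the bit that driver and device flip */
};

_Static_assert(sizeof(struct desc) == 16, "descriptor must be 16 bytes");

/* Number of descriptors sharing one cache line (4 with the sizes above). */
#define DESCS_PER_LINE (CACHE_LINE_SIZE / sizeof(struct desc))

/* Cache line index of descriptor i, assuming the ring base is line-aligned. */
static inline size_t desc_cache_line(size_t i)
{
    return i / DESCS_PER_LINE;
}

/* Nonzero when the slot the driver writes and the slot the device writes
 * live in the same cache line, i.e. when ping-pong is possible. */
static inline int same_cache_line(size_t driver_idx, size_t device_idx)
{
    return desc_cache_line(driver_idx) == desc_cache_line(device_idx);
}
```

With these assumptions, slots 0..3 share a line and slot 4 starts the next one,
so contention is only possible while driver and device are within the same
group of four descriptors.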
Steve:
- is the _MORE flag from the packed ring layout proposal still in use? what is
its meaning?
Ilya:
- you might have more completions than descriptors available
- partial descriptor chains are a problem for hardware because you might have
to read a bunch of descriptors twice
- how would you deal with a big buffer that contains a large number of
small packets with respect to completions?
- is one bit for completion enough? right now it means descriptor was actually
used. how do we signal when it was completed?
- concerned about not being able to do scatter/gather with the ring layout.
Network drivers heavily use indirect buffers.
- for a hardware implementation a completion ring is a very convenient form for
some use cases, so we want an efficient implementation for them. If we had an
inline descriptor then a completion ring is just a normal ring and we won't
need another ring type.
- doesn't like the fact that we need to do a linear scan to find the length of
a descriptor chain. It would be nice if we could have the length of the chain
in the first descriptor (i.e. the number of chained descriptors, not the number
of posted descriptors which can be deduced from the id field)
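
The chain length point above can be sketched as follows. This is only an
illustration under stated assumptions: the descriptor struct, the DESC_F_MORE
bit value and the idea of storing the chain length in the upper bits of the
head descriptor's flags are all hypothetical and not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_F_MORE     0x1u /* hypothetical: another descriptor follows */
#define CHAIN_LEN_SHIFT 8    /* hypothetical: flags bits 8..15 = chain length */

/* Hypothetical 16-byte descriptor; field names are illustrative only. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Linear scan: one descriptor read per chain element, wrapping at ring end.
 * The n < ring_size bound stops a malformed chain from looping forever. */
static size_t chain_len_scan(const struct desc *ring, size_t ring_size,
                             size_t head)
{
    size_t n = 1;
    size_t i = head;

    while ((ring[i].flags & DESC_F_MORE) && n < ring_size) {
        i = (i + 1) % ring_size;
        n++;
    }
    return n;
}

/* Alternative: the length is available after reading only the head. */
static size_t chain_len_head(const struct desc *ring, size_t head)
{
    return (size_t)(ring[head].flags >> CHAIN_LEN_SHIFT);
}
```

The second variant needs a single descriptor read regardless of chain length,
which is the property asked for here; where the count would actually be stored
is an open design question.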
Vlad:
- there were discussions about having a bigger descriptor. then we would
have more space to put things like a vnet header into the descriptor. It would
also mean fewer conflicts when accessing the same cache line. (descriptors
already grew to 16 bytes, do we need more?)
- was playing around with the idea of different ring types for different devices
e.g. scsi, net. starting with generic information then comes protocol
specific data. Ilya agrees. length of descriptor would be flexible by adding a
descriptor length field.
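
A rough sketch of what "generic information followed by protocol specific
data" with a descriptor length field could look like. Everything here is a
hypothetical illustration: the header layout, the desc_len and type fields and
the count_descs() helper were not discussed in this form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic header; protocol specific bytes (e.g. a vnet header)
 * would follow it directly, up to desc_len bytes in total. */
struct gen_desc_hdr {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
    uint16_t desc_len; /* size of this descriptor in bytes, header included */
    uint16_t type;     /* hypothetical: e.g. net or scsi */
    uint32_t reserved;
};

/* Walk a byte region of variable-length descriptors and count them.
 * Stops at the first descriptor whose length field is implausible. */
static size_t count_descs(const uint8_t *buf, size_t buf_len)
{
    size_t off = 0;
    size_t n = 0;

    while (off + sizeof(struct gen_desc_hdr) <= buf_len) {
        struct gen_desc_hdr h;

        memcpy(&h, buf + off, sizeof h); /* avoid unaligned access */
        if (h.desc_len < sizeof h || h.desc_len > buf_len - off)
            break;
        off += h.desc_len;
        n++;
    }
    return n;
}
```

The walk has to read each header to find the next descriptor, so a scheme like
this trades fixed-stride indexing for flexibility.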
How to continue / TODOs:
- do benchmarking with bigger frame sizes on fast enough NICs
- turn prototype code into an RFC series (work in progress)
- more people interested to join monthly meetings
Open questions:
- Do we need an (optional) completion ring?
- Is there a situation where 4 descriptors in a cache line is a problem because
we access the same cache line, causing cache ping-pong?
- Interrupt suppression requires device to do a memory read after writing out
descriptors? Will that be too costly? Let driver write out index?
regards
Jens