On Wed, Feb 08, 2017 at 06:41:40PM +0100, Paolo Bonzini wrote:
> 
> 
> On 08/02/2017 04:20, Michael S. Tsirkin wrote:
> > * Scatter/gather support
> > 
> > We can use 1 bit to chain s/g entries in a request, same as virtio 1.0:
> > 
> > /* This marks a buffer as continuing via the next field. */
> > #define VRING_DESC_F_NEXT       1
> > 
> > Unlike virtio 1.0, all descriptors must have distinct ID values.
> > 
> > Also unlike virtio 1.0, use of this flag will be an optional feature
> > (e.g. VIRTIO_F_DESC_NEXT) so both devices and drivers can opt out of it.
> 
> I would still prefer that we had _either_ single-direct or
> multiple-indirect descriptors, i.e. no VRING_DESC_F_NEXT.  I can propose
> my idea for this in a separate message.

All it costs us spec-wise is a single bit :)

The cost of indirect is an extra cache miss.

We couldn't decide what's better for everyone in 1.0 days and I doubt
we'll be able to now, but yes, benchmarking is needed to make
sure it's required.  Very easy to remove or not to use/support in
drivers/devices though.

> > * Batching descriptors:
> > 
> > virtio 1.0 allows passing a batch of descriptors in both directions, by
> > incrementing the used/avail index by values > 1.  We can support this by
> > chaining a list of descriptors through a bit in the flags field.
> > To allow use together with s/g, a different bit will be used.
> > 
> > #define VRING_DESC_F_BATCH_NEXT 0x0010
> > 
> > Batching works for both driver and device descriptors.
> 
> I'm still not sure how this would be useful.

So this is used at least by virtio-net mergeable buffers to combine
many buffers into a single packet.

Similarly, on transmit Linux sometimes supplies packets in batches
(XMIT_MORE flag).  If the other side is processing them, it seems nice
to tell it: there's more to come soon, so if you see this it is wise
to poll now.

That's why I kind of felt it's better as a standard bit.
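To make the two bits a bit more concrete, here is a rough sketch of how
a consumer might tell them apart.  Only the two flag values come from
the text above.  The descriptor layout, the demo_ names, the flat
(non-wrapping) array and the choice to put the batch hint on the last
s/g entry of a request are all assumptions made for illustration, not
proposal text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as quoted in this thread. */
#define VRING_DESC_F_NEXT       1       /* s/g: request continues in next entry */
#define VRING_DESC_F_BATCH_NEXT 0x0010  /* batch hint: more requests coming soon */

/* Hypothetical descriptor layout, for illustration only. */
struct demo_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;     /* distinct per descriptor, as proposed above */
	uint16_t flags;
};

/*
 * Walk one request starting at descs[start] in a flat array of n entries
 * (ring wrap-around is deliberately left out of this sketch).
 * Returns the number of s/g entries in the request and sets *more_coming
 * if the last entry carries the batch hint (an assumed placement).
 */
static size_t demo_walk_request(const struct demo_desc *descs, size_t n,
				size_t start, int *more_coming)
{
	size_t i = start;

	assert(start < n);
	/* Follow the s/g chain until an entry without F_NEXT. */
	while ((descs[i].flags & VRING_DESC_F_NEXT) && i + 1 < n)
		i++;
	*more_coming = !!(descs[i].flags & VRING_DESC_F_BATCH_NEXT);
	return i - start + 1;
}
```

With something like this, a consumer that finds the ring empty after a
request with more_coming set would keep polling for a while instead of
re-enabling notifications right away.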

> It cannot be mandatory to
> set the bit, I think, because you don't know when the host/guest is
> going to read descriptors.  So both host and guest always have to look
> ahead one element in any case.

Right, but the point is what to do if you find nothing there.
If you saw VRING_DESC_F_BATCH_NEXT it's a hint that
you should poll, there's more to come soon.

> > * Non power-of-2 ring sizes
> > 
> > As the ring simply wraps around, there's no reason to
> > require ring size to be power of two.
> > It can be made a separate feature though.
> 
> Power of 2 ring sizes are required in order to ignore the high bits of
> the indices.  With non-power-of-2 sizes you are forced to keep the
> indices less than the ring size.

Right. So

	if (unlikely(++idx >= size))
		idx = 0;

OTOH a ring that's up to twice as large as necessary
because of the power of two requirement wastes cache.

> Alternatively you can do this:
> 
> > * Event index would be in the range 0 to 2 * Queue Size
> > (to detect wrap arounds) and wrap to 0 after that.
> > 
> > The assumption is that each side maintains an internal
> > descriptor counter 0 to 2 * Queue Size that wraps to 0.
> > In that case, interrupt triggers when counter reaches
> > the given value.
> 
> but it seems more complicated than just forcing power-of-2 and ignoring
> the high bits.
> 
> Thanks,
> 
> Paolo

Absolutely, power of 2 lets you save a branch.  At this stage I'm just
recording all the ideas and then as a next step we can micro-benchmark
prototypes and compare.

-- 
MST
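
For reference, the index trade-off discussed above can be sketched as
two small helpers.  The demo_ function names and the 16-bit index type
are assumptions for illustration.  This is only meant as a starting
point for the micro-benchmarks, not a proposed implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Power-of-2 ring: mask off the high bits, no branch needed. */
static inline uint16_t demo_next_pow2(uint16_t idx, uint16_t size)
{
	/* size must be a non-zero power of two for the mask to work */
	assert(size != 0 && (size & (size - 1)) == 0);
	return (uint16_t)((idx + 1) & (size - 1));
}

/* Arbitrary ring size: index must stay below size, so compare and reset. */
static inline uint16_t demo_next_any(uint16_t idx, uint16_t size)
{
	assert(idx < size);
	if (++idx >= size)
		idx = 0;
	return idx;
}
```

The second helper costs one (normally well predicted) branch per
increment, and in exchange the ring can be sized to exactly what the
workload needs.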