On Thu, Jan 12, 2012 at 08:13:42AM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2012-01-11 at 12:21 +0200, Michael S. Tsirkin wrote:
> > > BenH also convinced me we should finally make the config space LE if
> > > we're going to change things. Since PCI is the most common transport,
> > > guest-endian confuses people. And it sucks for really weird machines.
> >
> > Are we going to keep guest endian for e.g. the virtio net header?
> > If yes, the benefit of switching the config space is not that big.
> > And changes in devices would affect non-PCI transports.
>
> I think the concept of "guest endian" is broken by design. What does
> that mean when running, for example, an ARM or a ppc 440 "guest" which
> could be either endian? Since you can't hard code your guest endian,
> how do you obtain/negotiate it? Also, you now have to deal with dual
> endian in the host, which makes everything trickier.
>
> Just make everything LE.

Yeah. But it's not a pure transport issue, so just fixing configuration
won't be enough. E.g. we have structures like the virtio net header.

> > Quite possibly all or some of these things help performance,
> > but do we have to change the spec before we have experimental
> > proof?
>
> Well, I would argue that the network driver world has proven countless
> times that those are good ideas :-)

Below you seem to suggest that separate rings like virtio has now are
better than the single ring Rusty suggested.

> But by all means, let's do a
> prototype implementation with virtio-net for example and bench it.
>
> I don't think you need a single ring. For multiqueue net, you
> definitely want multiple rings, and you do want rings to remain
> uni-directional.
>
> One other thing that can be useful is to separate the completion ring
> from the actual ring of DMA descriptors, making the former completely
> read-only by the guest and the latter completely read-only by the
> host.

Are you familiar with the current virtio ring structure? How is this
different?

> For example, take the ehea ethernet rx model. It has 3 rx "rings" per
> queue. One contains the completions; it's a toggle-valid model, so we
> never write back to clear valid. It contains info from the parser, the
> token ID of the packet, and an index saying where (in which ring) the
> data is: either inline in the completion ring (small packet), header
> inline & data in a data ring, or completely in a data ring. Then you
> have two data rings which are simply rings of SG list entries (more
> or less).
>
> We typically pre-populate the data rings with skbs for 1500- and
> 9000-byte packets. Small packets come in immediately in the completion
> ring, and large packets via the data ring.

Won't real workloads suffer from packet reordering?

> That's just -an- example. There are many others to take inspiration
> from. Network folks have beaten the problem of ring efficiency vs.
> CPU caches to death.
>
> > > Moreover, I think we should make all these changes at once (at
> > > least, in the spec). That makes it a big change, and it'll take
> > > longer to develop, but makes it easy in the long run to
> > > differentiate legacy and modern virtio.
> > >
> > > Thoughts?
> > > Rusty.
> >
> > BTW this seems to be the reverse of what you had in Mar 2001,
> > see 87mxkjls61.fsf@xxxxxxxxxxxxxxx :)
>
> That was 10 years ago...

Sorry, typo. It was Mar 2010 :)

> > I am much less concerned with what we do for configuration,
> > but I do not believe we have learned all performance lessons
> > from virtio ring1.
>
> Maybe we have learned some more since then?

:-) There was one change in ring layout.
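
For reference, here is roughly what the ring looks like today (a
simplified sketch of the split-ring structures along the lines of
linux/virtio_ring.h; the one layout change mentioned above being the
event-index fields that VIRTIO_RING_F_EVENT_IDX appends to the avail
and used rings):

    /* Descriptor table: written only by the guest. */
    struct vring_desc {
            __u64 addr;     /* guest-physical buffer address */
            __u32 len;      /* buffer length */
            __u16 flags;    /* NEXT / WRITE / INDIRECT */
            __u16 next;     /* chaining via VRING_DESC_F_NEXT */
    };

    /* Available ring: written only by the guest. */
    struct vring_avail {
            __u16 flags;
            __u16 idx;
            __u16 ring[];   /* heads of descriptor chains */
            /* with EVENT_IDX: __u16 used_event after ring[] */
    };

    /* Used ring: written only by the host. */
    struct vring_used_elem {
            __u32 id;       /* head of the completed chain */
            __u32 len;      /* bytes the host wrote */
    };

    struct vring_used {
            __u16 flags;
            __u16 idx;
            struct vring_used_elem ring[];
            /* with EVENT_IDX: __u16 avail_event after ring[] */
    };

So the guest-written and host-written parts are already separate
structures, which is what the question above is getting at.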
> > Is there any reason why we shouldn't be
> > able to experiment with inline within virtio1 and see
> > whether that gets us anything?
> > If we do a bunch of changes to the ring at once, we can't
> > figure out what's right, what's wrong, or back out of
> > mistakes later.
> >
> > Since there are non-PCI transports that use the ring,
> > we really shouldn't make both the configuration and
> > the ring changes depend on the same feature bit.
>
> Another advantage of inline data is that it makes things a lot easier
> for cases where only a small amount of data needs to be exchanged,
> such as control/status rings, maybe virtio-tty (which I'm working on),
> etc...
>
> Cheers,
> Ben.

Is that getting you a lot of speedup? Note you want to add more code on
the data path for everyone. Why can't you have a fixed buffer in memory
and just point to that?
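
Something like this minimal sketch (the ctrl_buf structure and
send_ctrl_cmd helper are hypothetical, and locking and completion
handling are omitted; it uses the existing Linux virtqueue API rather
than any new inline-data support):

    #include <linux/scatterlist.h>
    #include <linux/virtio.h>

    /* Hypothetical fixed control buffer, allocated once at probe
     * time, e.g. with kmalloc(sizeof(*buf), GFP_KERNEL). */
    struct ctrl_buf {
            u8 cmd;         /* guest -> host */
            u8 status;      /* host -> guest */
    };

    static struct ctrl_buf *buf;

    static int send_ctrl_cmd(struct virtqueue *vq, u8 cmd)
    {
            struct scatterlist sg[2];

            buf->cmd = cmd;
            sg_init_table(sg, 2);
            sg_set_buf(&sg[0], &buf->cmd, sizeof(buf->cmd));
            sg_set_buf(&sg[1], &buf->status, sizeof(buf->status));

            /* Point the descriptors at the same fixed buffer every
             * time: no copy into the ring, no per-command
             * allocation. */
            if (virtqueue_add_buf(vq, sg, 1, 1, buf) < 0)
                    return -ENOSPC;
            virtqueue_kick(vq);
            return 0;
    }

-- 
MST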