On Tue, 06 Dec 2011 11:58:21 +0200, Avi Kivity <avi@xxxxxxxxxx> wrote:
> On 12/06/2011 07:07 AM, Rusty Russell wrote:
> > Yes, but the hypervisor/trusted party would simply have to do the copy;
> > the rings themselves would be shared.  A would say "copy this to/from B's
> > ring entry N" and you know that A can't have changed B's entry.
>
> Sorry, I don't follow.  How can the rings be shared?  If A puts a gpa in
> A's address space into the ring, there's no way B can do anything with
> it, it's an opaque number.  Xen solves this with an extra layer of
> indirection (grant table handles) that cost extra hypercalls to map or
> copy.

It's not symmetric.  B can see the desc and avail pages R/O, and the
used page R/W.  It needs to ask something to copy in/out of the
descriptors, though, because they're opaque numbers, and it doesn't
have access.  i.e. the existence of the descriptor in the ring *implies*
a grant.

Perhaps this could be generalized further into a "connect these two
rings", but I'm not sure.  Descriptors with both read and write parts
are tricky.

> > Every driver really wants to put a pointer in there.  We have an array
> > to map desc. numbers to cookies inside the virtio core.
> >
> > We really want 64 bits.
>
> With multiqueue, it may be cheaper to do the extra translation locally
> than to ship the cookie across cores (or, more likely, it will make no
> difference).

Indeed.

> However, moving pointers only works if you trust the other side.  That
> doesn't work if we manage to share a ring.

Yes, that part needs to be trusted too.

> > I'm just not sure how the host would even know to hint.
>
> For JBOD storage, a good rule of thumb is (number of spindles) x 3.
> With less, you might leave an idle spindle; with more, you're just
> adding latency.  This assumes you're using indirects so ring entry ==
> request.  The picture is muddier with massive battery-backed RAID
> controllers or flash.
>
> For networking, you want (ring size) * min(expected packet size, page
> size) / (link bandwidth) to be something that doesn't get the
> bufferbloat people after your blood.

OK, so while neither side knows, the host knows slightly more.

Now I think about it, from a spec POV, saying it's a "hint" is useless,
as it doesn't tell the driver what to do with it.  I'll say it's a
maximum, which keeps it simple.

Cheers,
Rusty.
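
For concreteness, the two rules of thumb quoted above can be written out as
a rough sketch.  Nothing here is from the virtio spec: the function names,
the 4096-byte page size, and the choice of bytes and bits per second as
units are assumptions made only for illustration.

```python
# Sketch of the two sizing rules of thumb quoted in this thread.
# All names and units below are hypothetical, not part of any virtio API.

def storage_ring_entries(spindles: int) -> int:
    """JBOD rule of thumb: (number of spindles) x 3.

    Assumes indirect descriptors, so one ring entry == one request.
    """
    if spindles < 1:
        raise ValueError("need at least one spindle")
    return spindles * 3


def net_ring_drain_seconds(ring_size: int,
                           expected_packet_bytes: int,
                           link_bits_per_second: float,
                           page_bytes: int = 4096) -> float:
    """(ring size) * min(expected packet size, page size) / (link bandwidth).

    Returns the time, in seconds, that a full ring takes to drain at line
    rate; this is the figure that should stay small to avoid bufferbloat.
    Packet and page sizes are taken in bytes, bandwidth in bits per second
    (an assumed unit choice), hence the factor of 8.
    """
    bytes_queued = ring_size * min(expected_packet_bytes, page_bytes)
    return bytes_queued * 8 / link_bits_per_second


if __name__ == "__main__":
    print(storage_ring_entries(4))
    print(net_ring_drain_seconds(256, 1500, 1e9))
```

With these assumptions, a 256-entry ring of 1500-byte packets on a 1 Gbit/s
link holds about 3 ms of traffic, and a 4-spindle JBOD would want 12 entries.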