Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2009 at 11:21:53AM -0400, Gregory Haskins wrote:
>> Michael S. Tsirkin wrote:
>>> Remove in_range from kvm_io_device and ask read/write callbacks, if
>>> supplied, to perform range checks internally. This allows aliasing
>>> (mostly for in-kernel virtio), as well as better error handling by
>>> making it possible to pass errors up to userspace. And it's enough to
>>> look at the diffstat to see that it's a better API anyway.
>>>
>>> While we are at it, document locking rules for kvm_io_device.
>>>
>> Sorry, not trying to be a PITA, but I liked your last suggestion better. :(
>>
>> I am thinking forward to when we want to use something smarter than a
>> linear search (like rbtree/radix) for scaling the number of "devices"
>> (really, virtio-rings) that we support.
>>
>
> in_range is broken for this anyway: you need more than a boolean
> predicate to implement rbtree/radix
>

Yes, understood. in_range() needs to be (pardon the pun) "addressed" ;).
But getting rid of in_range() and moving the match logic into the
read()/write() verbs is potentially a step in the wrong direction if we
ever want to go that route. And I'm pretty sure we do.

>> The current device-count
>> target is 512, which we will begin to rapidly consume as the in-kernel
>> virtio work progresses.
>>
>
> That's a large number. I had in mind more like 4 virtio devices, for
> starters: 1 for each virtqueue in net and block.
>

That's way too low. For instance, I'll be wanting to do things like
802.1p, which would be 16 virtio-rings per device (8 prio levels tx, 8
levels rx). And that's just for one device. I think Avi came up with an
estimate of supporting 20 devices @ 16 queues = 320, so we rounded it to
512.

>> This proposed approach forces us into a
>> potential O(256) algorithm in the hotpath (all MMIO/PIO exits will hit
>> this, not just in-kernel users). How would you address this?
>>
>
> Two ideas that come to mind:
> - add addr/len fields to devices, use these to speed up lookup
>

Yep, that's what I was thinking as well. We can have the top-level
(group) be an rbtree on addr/len, and then walk the list of items at
that address linearly using your read/write() approach.

> - add a small cache that can be scanned first
>

Yep, I think we may want to do this anyway, independent of the search alg.

> In both cases, you first do a fast lookup, ask the device whether
> it wants the transaction, then resort to linear scan if not
>

-Greg
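
For illustration only, the lookup order discussed above (scan a small cache first, ask the device whether it wants the transaction, then fall back to a linear scan) might look roughly like the sketch below. It is standalone userspace C, not kernel code and not the patch under review. Every name in it (io_dev, io_bus, bus_write, simple_write, the one-entry cache, the slow_scans counter) is hypothetical, and the rbtree on addr/len is deliberately left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 512 /* device-count target mentioned in the thread */

/* Hypothetical device: addr/len fields plus a write callback that
 * performs its own range check and returns 0 only if it handled
 * the access. */
struct io_dev {
    uint64_t addr;
    uint32_t len;
    int (*write)(struct io_dev *dev, uint64_t addr, uint32_t len,
                 uint64_t val);
    uint64_t last_val; /* last value written, for demonstration */
    unsigned hits;     /* number of accesses this device accepted */
};

/* Hypothetical bus: flat device array plus a one-entry cache. */
struct io_bus {
    struct io_dev *devs[MAX_DEVS];
    size_t ndevs;
    struct io_dev *cache;     /* last device that accepted an access */
    unsigned long slow_scans; /* how often the linear scan was needed */
};

/* Overflow-safe check that [addr, addr + len) lies inside the device. */
static int dev_in_range(const struct io_dev *dev, uint64_t addr,
                        uint32_t len)
{
    return addr >= dev->addr && len <= dev->len &&
           addr - dev->addr <= dev->len - len;
}

/* Example callback: range check done inside the write verb itself. */
static int simple_write(struct io_dev *dev, uint64_t addr, uint32_t len,
                        uint64_t val)
{
    if (!dev_in_range(dev, addr, len))
        return -1; /* not ours, let the bus keep looking */
    dev->last_val = val;
    dev->hits++;
    return 0;
}

static void bus_register(struct io_bus *bus, struct io_dev *dev)
{
    assert(bus->ndevs < MAX_DEVS);
    bus->devs[bus->ndevs++] = dev;
}

/* Fast path: ask the cached device first. Slow path: linear scan,
 * remembering whichever device accepts the access. */
static int bus_write(struct io_bus *bus, uint64_t addr, uint32_t len,
                     uint64_t val)
{
    size_t i;

    if (bus->cache && bus->cache->write(bus->cache, addr, len, val) == 0)
        return 0;

    bus->slow_scans++;
    for (i = 0; i < bus->ndevs; i++) {
        struct io_dev *dev = bus->devs[i];

        if (dev == bus->cache)
            continue; /* already asked on the fast path */
        if (dev->write(dev, addr, len, val) == 0) {
            bus->cache = dev;
            return 0;
        }
    }
    return -1; /* no in-kernel device claimed the access */
}
```

In this sketch, repeated accesses to the same device skip the linear scan entirely, and a miss costs one extra callback before the scan. A real implementation would presumably replace the flat array with the rbtree keyed on addr/len and would need the locking rules the patch documents.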