On 01/25/2011 04:41 PM, Alex Williamson wrote:
> > > kvm: Allow memory slot array to grow on demand
> > >
> > > Remove fixed KVM_MEMORY_SLOTS limit, allowing the slot array
> > > to grow on demand. Private slots are now allocated at the
> > > front instead of the end. Only x86 seems to use private slots,
> >
> > Hmm, doesn't current user space expect slots 8..11 to be the private
> > ones and wouldn't it cause troubles if slots 0..3 are suddenly reserved?
>
> The private slots aren't currently visible to userspace, they're
> actually slots 32..35. The patch automatically increments user passed
> slot ids so userspace has its own zero-based view of the array.
> Frankly, I don't understand why userspace reserves slots 8..11. Is
> this compatibility with older kernel implementations?

I think so. I believe these kernel versions are too old now to matter, but of course I can't be sure.

> > > so this is now zero for all other archs. The memslots pointer
> > > is already updated using rcu, so changing the size of the
> > > array when it's replaced is straightforward. x86 also keeps
> > > a bitmap of slots used by a kvm_mmu_page, which requires a
> > > shadow tlb flush whenever we increase the number of slots.
> > > This forces the pages to be rebuilt with the new bitmap size.
> >
> > Is it possible for user space to increase the slot number to ridiculous
> > amounts (at least as far as kmalloc allows) and then trigger a kernel
> > walk through them in non-preemptible contexts? Just wondering, I haven't
> > checked all contexts of functions like kvm_is_visible_gfn yet.
> >
> > If yes, we should already switch to rbtree or something like that.
> > Otherwise that may wait a bit, but probably not too long.
>
> Yeah, Avi has brought up the hole that userspace can exploit this
> interface with these changes. However, for 99+% of users, this change
> leaves the slot array at about the same size, or makes it smaller. Only
> huge, scale-out guests would probably even see a doubling of slots (my
> guest with 14 82576 VFs uses 48 slots). On the kernel side, I think we
> can safely save a tree implementation as a later optimization should we
> determine it's necessary. We'll have to see how the userspace side
> matches to figure out what's best there. Thanks,

A tree would probably be a pessimization until we are able to cache the result of lookups. That's because the linear scan generates a very simple pattern of branch predictions and memory accesses, while a tree uses a whole bunch of cachelines and generates unpredictable branches (if the inputs are unpredictable).
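
To make the "linear scan plus cached result" idea concrete, here is a minimal userspace sketch. It is not kernel code: struct slot, struct slot_table, the last_hit field and lookup() are all made-up names for illustration, and the one-entry cache is just one assumed way of caching a lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slot; not the kernel's struct kvm_memory_slot. */
struct slot {
    uint64_t base_gfn;  /* first guest frame number covered */
    uint64_t npages;    /* number of pages covered */
};

/* Hypothetical container: a flat array plus a one-entry lookup cache. */
struct slot_table {
    struct slot *slots;
    size_t nslots;
    struct slot *last_hit;  /* slot that satisfied the previous lookup, or NULL */
};

static int slot_contains(const struct slot *s, uint64_t gfn)
{
    return gfn >= s->base_gfn && gfn - s->base_gfn < s->npages;
}

/* Try the cached slot first, then fall back to a plain linear scan. */
static struct slot *lookup(struct slot_table *t, uint64_t gfn)
{
    assert(t != NULL);

    if (t->last_hit && slot_contains(t->last_hit, gfn))
        return t->last_hit;

    for (size_t i = 0; i < t->nslots; i++) {
        if (slot_contains(&t->slots[i], gfn)) {
            t->last_hit = &t->slots[i];  /* remember the hit */
            return t->last_hit;
        }
    }
    return NULL;  /* a miss leaves the cache untouched */
}
```

The scan touches the array sequentially and every iteration takes the same branch until the match, which is the access pattern described above.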
Note that with TDP most lookups result in failure, so all we need is a fast way to determine whether to perform the lookup at all or not. That can be done by caching the last lookup for this address in the spte by setting a reserved bit. For the other lookups, which we believe will succeed, we can assume the probability of a match is related to the slot size, and sort the slots by page count.
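
A rough userspace sketch of both ideas follows. Everything in it is an assumption made for illustration: struct slot is a simplified stand-in, SPTE_NO_SLOT is a made-up marker bit (the real reserved-bit positions depend on the CPU and paging mode), and resolve() is a hypothetical helper, not an existing kvm function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified slot; not the kernel's struct kvm_memory_slot. */
struct slot {
    uint64_t base_gfn;
    uint64_t npages;
};

/* Made-up marker meaning "a previous lookup for this address failed". */
#define SPTE_NO_SLOT (1ULL << 62)

/* Order slots by page count, largest first. */
static int cmp_npages_desc(const void *a, const void *b)
{
    const struct slot *x = a, *y = b;

    if (x->npages < y->npages)
        return 1;
    if (x->npages > y->npages)
        return -1;
    return 0;
}

static void sort_slots_by_size(struct slot *slots, size_t n)
{
    qsort(slots, n, sizeof(*slots), cmp_npages_desc);
}

/* Linear scan; with the array sorted, big slots are tested first. */
static struct slot *find_slot(struct slot *slots, size_t n, uint64_t gfn)
{
    for (size_t i = 0; i < n; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn - slots[i].base_gfn < slots[i].npages)
            return &slots[i];
    }
    return NULL;
}

/* Skip the scan if the spte already records a miss; record new misses. */
static struct slot *resolve(struct slot *slots, size_t n,
                            uint64_t gfn, uint64_t *spte)
{
    struct slot *s;

    assert(spte != NULL);
    if (*spte & SPTE_NO_SLOT)
        return NULL;          /* cached failure, no lookup performed */

    s = find_slot(slots, n, gfn);
    if (!s)
        *spte |= SPTE_NO_SLOT;  /* remember the failure for next time */
    return s;
}
```

Any scheme like this would also have to drop the cached markers whenever the slot array changes, otherwise a newly added slot would never be found for those addresses.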

--
error compiling committee.c: too many arguments to function