On 04/27/2012 06:55 AM, Paul Mackerras wrote:
> At present, on powerpc with Book 3S HV KVM, the kernel allocates a
> fixed-size MMU hashed page table (HPT) to store the hardware PTEs for
> the guest.  The hash table is currently always 16MB in size, but this
> is larger than necessary for small guests (i.e. those with less than
> about 1GB of RAM) and too small for large guests.  Furthermore, there
> is no way for userspace to clear it out when resetting the guest.
>
> This adds a new ioctl to enable qemu to control the size of the guest
> hash table, and to clear it out when resetting the guest.  The
> KVM_PPC_ALLOCATE_HTAB ioctl is a VM ioctl and takes as its parameter a
> pointer to a u32 containing the desired order of the HPT (log base 2
> of the size in bytes), which is updated on successful return to the
> actual order of the HPT which was allocated.
>
> There must be no vcpus running at the time of this ioctl.  To enforce
> this, we now keep a count of the number of vcpus running in
> kvm->arch.vcpus_running.
>
> If the ioctl is called when a HPT has already been allocated, we don't
> reallocate the HPT but just clear it out.  We first clear the
> kvm->arch.rma_setup_done flag, which has two effects: (a) since we hold
> the kvm->lock mutex, it will prevent any vcpus from starting to run until
> we're done, and (b) it means that the first vcpu to run after we're done
> will re-establish the VRMA if necessary.
>
> If userspace doesn't call this ioctl before running the first vcpu, the
> kernel will allocate a default-sized HPT at that point.  We do it then
> rather than when creating the VM, as the code did previously, so that
> userspace has a chance to do the ioctl if it wants.
>
> When allocating the HPT, we can allocate either from the kernel page
> allocator, or from the preallocated pool.  If userspace is asking for
> a different size from the preallocated HPTs, we first try to allocate
> using the kernel page allocator.
> Then we try to allocate from the
> preallocated pool, and then if that fails, we try allocating decreasing
> sizes from the kernel page allocator, down to the minimum size allowed
> (256kB).
>

How difficult is it to have the kernel resize the HPT on demand?  Guest
size is meaningless in the presence of memory hotplug, and having
unprivileged userspace pin down large amounts of kernel memory is
undesirable.

On x86 we grow and shrink the MMU resources in response to guest demand
and host memory pressure.  We can do this because the data structures
are not authoritative (don't know if that's the case for ppc) and
because they can be grown incrementally (pretty sure that isn't the case
on ppc).

Still, if we can do this at KVM_SET_USER_MEMORY_REGION time instead of a
separate ioctl, I think it's better.

-- 
error compiling committee.c: too many arguments to function