> On Thu, Jul 28, 2016 at 06:36:18AM +0000, Li, Liang Z wrote:
> > > > > This ends up doing a 1MB kmalloc() right?  That seems a _bit_ big.
> > > > > How big was the pfn buffer before?
> > > >
> > > > Yes, it is if the max pfn is more than 32GB.
> > > > The pfn buffer used before was 256*4 = 1024 bytes; it's too
> > > > small, and it's the main reason for the bad performance.
> > > > Using a kmalloc of at most 1MB is a balance between performance and
> > > > flexibility: a large page bitmap that covers the range of all the
> > > > memory is no good for a system with a huge amount of memory. If the
> > > > bitmap is too small, we have to traverse a long list many times,
> > > > which is bad for performance.
> > > >
> > > > Thanks!
> > > > Liang
> > >
> > > These are all your implementation decisions though.
> > >
> > > If guest memory is so fragmented that you only have order 0 4k
> > > pages, then allocating a huge 1M contiguous chunk is very problematic
> > > in and of itself.
> > >
> >
> > The memory is allocated in the probe stage. This will not happen if
> > the driver is loaded when booting the guest.
> >
> > > Most people rarely migrate and do not care how fast that happens.
> > > Wasting a large chunk of memory (and it's zeroed for no good reason,
> > > so you actually request host memory for it) for everyone to speed it
> > > up when it does happen is not really an option.
> > >
> > If people don't plan to do inflating/deflating, they should not enable
> > virtio-balloon in the first place. Once they decide to use it, the
> > driver should provide the best performance it can.
>
> The reason people inflate/deflate is so they can overcommit memory.
> Do they need to overcommit very quickly? I don't see why.
> So let's get what we can for free but I don't really believe people would want
> to pay for it.
>
> > 1MB is a very small portion for a VM with more than 32GB memory and
> > it's the *worst case*; for a VM with less than 32GB memory, the amount
> > of RAM used depends on the VM's memory size and will be less than 1MB.
>
> It's guest memory so it might all be in swap and never touched; your memset
> at probe time will fault it in and make the hypervisor actually pay for it.
>
> > If 1MB is too big, how about 512K, or 256K?  32K seems too small.
> >
> > Liang
>
> It's only small because it makes you rescan the free list.
> So maybe you should do something else.
> I looked at it a bit. Instead of scanning the free list, how about scanning actual
> page structures? If a page is unused, pass it to the host.
> Solves the problem of rescanning multiple times, does it not?
>

Yes, agree.

>
> Another idea: allocate a small bitmap at probe time (e.g. for deflate), allocate
> a bunch more on each request. Use something like GFP_ATOMIC and a
> scatter/gather, if that fails use the smaller bitmap.
>

So, the aim of v3 is to use a smaller bitmap without too heavy a performance penalty.
Thanks a lot!

Liang