On Wed, Dec 07, 2016 at 11:54:34AM -0800, Dave Hansen wrote:
> We're talking about a bunch of different stuff which is all being
> conflated. There are 3 issues here that I can see. I'll attempt to
> summarize what I think is going on:
>
> 1. Current patches do a hypercall for each order in the allocator.
>    This is inefficient, but independent from the underlying data
>    structure in the ABI, unless bitmaps are in play, which they aren't.
> 2. Should we have bitmaps in the ABI, even if they are not in use by the
>    guest implementation today? Andrea says they have zero benefits
>    over a pfn/len scheme. Dave doesn't think they have zero benefits
>    but isn't that attached to them. QEMU's handling gets more
>    complicated when using a bitmap.
> 3. Should the ABI contain records each with a pfn/len pair or a
>    pfn/order pair?
>    3a. 'len' is more flexible, but will always be a power-of-two anyway
>        for high-order pages (the common case)

In practice, len would not be a power of two only if we detect adjacent
pages of smaller order that can be merged with larger orders we already
allocated (or the other way around).

[addr=2M, len=2M] allocated at order 9 pass
[addr=4M, len=1M] allocated at order 8 pass -> merge as [addr=2M, len=3M]

Not sure if it would be worth it, but unless we do this, page-order
or len won't make much difference.

>    3b. if we decide not to have a bitmap, then we basically have plenty
>        of space for 'len' and should just do it
>    3c. It's easiest for the hypervisor to turn pfn/len into the
>        madvise() calls that it needs.
>
> Did I miss anything?

I think your summary covers all my arguments fine.

> FWIW, I don't feel that strongly about the bitmap. Li had one
> originally, but I think the code thus far has demonstrated a huge
> benefit without even having a bitmap.
>
> I've got no objections to ripping the bitmap out of the ABI.
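To make the merge case under 3a concrete, here is a minimal sketch in
Python. It is purely illustrative and not taken from any of the patches:
merge_ranges() is a hypothetical helper, and the records are assumed to
be plain (addr, len) pairs in bytes.

```python
# Illustrative only: coalesce physically adjacent (addr, len) records.
# merge_ranges() is a hypothetical helper, not part of any posted patch.
MiB = 1 << 20


def merge_ranges(ranges):
    """Merge adjacent or overlapping (addr, length) byte ranges."""
    merged = []
    for addr, length in sorted(ranges):
        if merged and merged[-1][0] + merged[-1][1] >= addr:
            # This range starts at or before the end of the previous one:
            # extend the previous record instead of adding a new one.
            prev_addr, prev_len = merged[-1]
            end = max(prev_addr + prev_len, addr + length)
            merged[-1] = (prev_addr, end - prev_addr)
        else:
            merged.append((addr, length))
    return merged


order9_pass = (2 * MiB, 2 * MiB)  # [addr=2M, len=2M]
order8_pass = (4 * MiB, 1 * MiB)  # [addr=4M, len=1M]
result = merge_ranges([order9_pass, order8_pass])


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0
```

With the two records from the example, result holds a single record
[addr=2M, len=3M], and its len is not a power of two, so it could not
be expressed as one pfn/order pair.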

I think we need to see a statistic showing the average number of bits
set in each bitmap after some uptime and LRU churn, for example by
running a stress test app for a while with I/O, then inflating the
balloon and counting:

1) how many bits were set vs total number of bits used in bitmaps

2) how many times bitmaps were used vs the bitmap_len = 0 case of a
   single page

My guess would be a very low percentage for both points.

> Surely we can think of a few ways...
>
> A bitmap is 64x more dense if the lists are unordered. It means being
> able to store ~32k*2M=64G worth of 2M pages in one data page vs. ~1G.
> That's 64x fewer cachelines to touch, 64x fewer pages to move to the
> hypervisor and lets us allocate 1/64th the memory. Given a maximum
> allocation that we're allowed, it lets us do 64x more per-pass.
>
> Now, are those benefits worth it? Maybe not, but let's not pretend they
> don't exist. ;)

In the best case there are obviously benefits; the question is how
common the best case is.

The best case, if I understand correctly, is that no high-order pages
are available, but plenty of order 0 pages are available at phys
address X, X+8k, X+16k, X+(8k*nr_bits_in_bitmap). How common is it that
order 0 pages exist but they're not at an address < X or
> X+(8k*nr_bits_in_bitmap)?

> Yes, the current code sends one batch of pages up to the hypervisor per
> order. But, this has nothing to do with the underlying data structure,
> or the choice to have an order vs. len in the ABI.
>
> What you describe here is obviously more efficient.

And it isn't possible with the current ABI.

So there is a connection between the MAX_ORDER..0 allocation loop and
the ABI change, but I agree that any of the proposed ABIs would still
allow this logic to be used. Bitmap or no bitmap, the loop would still
work.
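
For reference, the 64x density figure quoted above can be reproduced
with a few lines of arithmetic. This is only a sketch under assumptions
that are not stated in the thread: a 4 KiB data page and 8 bytes per
pfn/len record (the record size is inferred from the "~1G" figure), and
any per-bitmap header is ignored.

```python
# Back-of-the-envelope check of the bitmap vs. pfn/len density figures.
# PAGE_SIZE and RECORD_SIZE are assumptions, not values from the patches.
PAGE_SIZE = 4096      # assumed size of one data page, in bytes
RECORD_SIZE = 8       # assumed bytes per pfn/len record
HUGE_PAGE = 2 << 20   # one 2M page, in bytes
GiB = 1 << 30

bits_per_page = PAGE_SIZE * 8                 # one bit per 2M page
records_per_page = PAGE_SIZE // RECORD_SIZE   # one record per 2M page

bitmap_coverage = bits_per_page * HUGE_PAGE      # best case: every bit set
record_coverage = records_per_page * HUGE_PAGE   # every record used
ratio = bitmap_coverage // record_coverage

# A full page of bitmap only beats a full page of records once it has
# more bits set than the number of records that fit in the same page.
break_even_bits = records_per_page
break_even_fill = break_even_bits / bits_per_page

print(bitmap_coverage // GiB, record_coverage // GiB, ratio)
print(break_even_bits, break_even_fill)
```

Under these assumptions a page of bitmap pays off only when more than
512 of its 32768 bits are set (a fill ratio above 1/64), which is why
the measured number of bits set per bitmap matters for this decision.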