On Fri, Mar 04, 2016 at 03:13:03PM +0000, Li, Liang Z wrote:
> > > Maybe I am not clear enough.
> > >
> > > I mean if we inflate the balloon before live migration, for an 8GB guest,
> > > it takes about 5 seconds for the inflating operation to finish.
> >
> > And these 5 seconds are spent where?
> >
>
> The time is spent on allocating the pages and sending the allocated pages'
> pfns to QEMU through virtio.

What if we skip allocating pages but use the existing interface to send pfns
to QEMU?

> > > For the PV solution, there is no need to inflate the balloon before live
> > > migration. The only cost is traversing the free_list to construct
> > > the free pages bitmap, which takes about 20ms for an 8GB idle guest (less
> > > if there are fewer free pages); passing the free pages info to the host
> > > takes about 3ms extra.
> > >
> > >
> > > Liang
> >
> > So now let's please stop talking about solutions at a high level and discuss the
> > interface changes you make in detail.
> > What makes it faster? Better host/guest interface? No need to go through
> > buddy allocator within guest? Less interrupts? Something else?
> >
>
> I assume you are familiar with the current virtio-balloon and how it works.
> The new interface is very simple: send a request to the virtio-balloon driver.
> The virtio-balloon driver will traverse '&zone->free_area[order].free_list[t]' to
> construct a 'free_page_bitmap', and then the driver will send the content
> of 'free_page_bitmap' back to QEMU. That is all the new interface does, and
> there is no 'alloc_page' related work, so it's faster.
>
>
> Some code snippet:
> ----------------------------------------------
> +static void mark_free_pages_bitmap(struct zone *zone,
> +		 unsigned long *free_page_bitmap, unsigned long pfn_gap)
> +{
> +	unsigned long pfn, flags, i;
> +	unsigned int order, t;
> +	struct list_head *curr;
> +
> +	if (zone_is_empty(zone))
> +		return;
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +
> +	for_each_migratetype_order(order, t) {
> +		list_for_each(curr, &zone->free_area[order].free_list[t]) {
> +
> +			pfn = page_to_pfn(list_entry(curr, struct page, lru));
> +			for (i = 0; i < (1UL << order); i++) {
> +				if ((pfn + i) >= PFN_4G)
> +					set_bit_le(pfn + i - pfn_gap,
> +						   free_page_bitmap);
> +				else
> +					set_bit_le(pfn + i, free_page_bitmap);
> +			}
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +}
> ----------------------------------------------------
> Sorry for my poor English and expression, if you still can't understand,
> you could glance at the patch, total about 400 lines.
> >
> >
> > --
> > MST
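
For anyone who wants to try the bitmap layout outside the kernel, here is a
minimal user-space sketch of the marking logic in the quoted snippet. It is
not the patch itself. Everything in it is an assumption made for
illustration: the 4 KiB page size, the struct free_block type standing in
for a free-list entry, the model_* bit helpers, and the reading of pfn_gap
as the number of page frames in the hole just below 4 GiB, so that frames
above 4 GiB are packed directly after low memory in the bitmap.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12                              /* assumption: 4 KiB pages */
#define PFN_4G (UINT64_C(1) << (32 - PAGE_SHIFT))  /* first pfn at 4 GiB */

/* Hypothetical stand-in for one entry on a buddy free list. */
struct free_block {
	uint64_t pfn;        /* first page frame of the block */
	unsigned int order;  /* block spans 1 << order pages */
};

/* Bytes needed for a bitmap holding one bit per page. */
static size_t bitmap_bytes(uint64_t npages)
{
	return (size_t)((npages + 7) / 8);
}

/* User-space model of little-endian bit numbering: bit nr is in byte nr / 8. */
static void model_set_bit_le(uint64_t nr, uint8_t *bitmap)
{
	bitmap[nr >> 3] |= (uint8_t)(1u << (nr & 7));
}

static int model_test_bit_le(uint64_t nr, const uint8_t *bitmap)
{
	return (bitmap[nr >> 3] >> (nr & 7)) & 1;
}

/*
 * Mark every page of every free block. Frames at or above 4 GiB are
 * shifted down by pfn_gap, mirroring the branch in the quoted snippet.
 */
static void mark_free_blocks(const struct free_block *blocks, size_t nblocks,
			     uint8_t *bitmap, uint64_t nbits, uint64_t pfn_gap)
{
	for (size_t k = 0; k < nblocks; k++) {
		for (uint64_t i = 0; i < (UINT64_C(1) << blocks[k].order); i++) {
			uint64_t pfn = blocks[k].pfn + i;
			uint64_t bit = pfn >= PFN_4G ? pfn - pfn_gap : pfn;

			assert(bit < nbits);  /* caller must size the bitmap */
			model_set_bit_le(bit, bitmap);
		}
	}
}

/* Count the set bits among the first nbits bits. */
static uint64_t count_set_bits(const uint8_t *bitmap, uint64_t nbits)
{
	uint64_t n = 0;

	for (uint64_t b = 0; b < nbits; b++)
		n += (uint64_t)model_test_bit_le(b, bitmap);
	return n;
}
```

Under the 4 KiB page assumption, an 8 GiB guest has 2097152 page frames, so
bitmap_bytes() gives 262144 bytes (256 KiB) for the whole bitmap. A caller
would allocate that many zeroed bytes, fill an array of struct free_block,
and call mark_free_blocks() once; no page allocation happens on that path,
which is the point made in the quoted text.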