> On 07/27/2016 03:05 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2016 at 09:40:56AM -0700, Dave Hansen wrote:
> >> On 07/26/2016 06:23 PM, Liang Li wrote:
> >>> +	for_each_migratetype_order(order, t) {
> >>> +		list_for_each(curr, &zone->free_area[order].free_list[t]) {
> >>> +			pfn = page_to_pfn(list_entry(curr, struct page, lru));
> >>> +			if (pfn >= start_pfn && pfn <= end_pfn) {
> >>> +				page_num = 1UL << order;
> >>> +				if (pfn + page_num > end_pfn)
> >>> +					page_num = end_pfn - pfn;
> >>> +				bitmap_set(bitmap, pfn - start_pfn, page_num);
> >>> +			}
> >>> +		}
> >>> +	}
> >>
> >> Nit: The 'page_num' nomenclature really confused me here. It is the
> >> number of bits being set in the bitmap. Seems like calling it
> >> nr_pages or num_pages would be more appropriate.
> >>
> >> Isn't this bitmap out of date by the time it's sent up to the
> >> hypervisor? Is there something that makes the inaccuracy OK here?
> >
> > Yes. Calling these free pages is unfortunate. It's likely to confuse
> > people into thinking they can just discard these pages.
> >
> > Hypervisor sends a request. We respond with this list of pages, and
> > the guarantee hypervisor needs is that these were free sometime
> > between request and response, so they are safe to free if they are
> > unmodified since the request. The hypervisor can detect modifications
> > itself and does not need guest help.
>
> Ahh, that makes sense.
>
> So the hypervisor is trying to figure out: "Which pages do I move?". It wants
> to know which pages the guest thinks have good data and need to move.
> But, the list of free pages is (likely) smaller than the list of pages with good
> data, so it asks for that instead.
>
> A write to a page means that it has valuable data, regardless of whether it
> was in the free list or not.
>
> The hypervisor only skips moving pages that were free *and* were never
> written to.
> So we never lose data, even if this "get free page info"
> stuff is totally out of date.
>
> The patch description and code comments are, um, a _bit_ light for this level
> of subtlety. :)

I will add more description about this in v3.

Thanks!
Liang
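
The rule summarized above (a page may be skipped only if it was reported free and has not been written since the request) can be sketched in plain user-space C. This is an illustration only, not code from the patch or from any hypervisor. The names guest_free, dirty_since_request, must_migrate and count_pages_to_migrate are hypothetical, and both bitmaps are assumed to be indexed by page offset within the same range.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Test bit i of a flat bitmap; bit i stands for page i of the range. */
static bool bit_is_set(const unsigned long *map, size_t i)
{
	return (map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

/* Set bit i of a flat bitmap. */
static void set_bit_at(unsigned long *map, size_t i)
{
	map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/*
 * Hypothetical hypervisor-side decision: a page can be skipped only if
 * the guest reported it free AND no write was observed after the request
 * was sent. Any write means the page holds valuable data again, even if
 * the (possibly stale) free bitmap still lists it.
 */
static bool must_migrate(const unsigned long *guest_free,
			 const unsigned long *dirty_since_request, size_t i)
{
	return !bit_is_set(guest_free, i) ||
	       bit_is_set(dirty_since_request, i);
}

/* Count the pages in [0, nr_pages) that still have to be moved. */
static size_t count_pages_to_migrate(const unsigned long *guest_free,
				     const unsigned long *dirty_since_request,
				     size_t nr_pages)
{
	size_t i, n = 0;

	assert(guest_free && dirty_since_request);
	for (i = 0; i < nr_pages; i++)
		if (must_migrate(guest_free, dirty_since_request, i))
			n++;
	return n;
}
```

With pages 1 and 2 reported free and pages 2 and 3 written after the request, only page 1 is skipped. Page 2 shows why a stale free bitmap is harmless: the later write forces it to be moved anyway.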