On Wed, Nov 20, 2013 at 10:20:09PM +0800, Xiao Guangrong wrote: > > But what guarantee does userspace require, from GET_DIRTY_LOG, while vcpus are > > executing? > > Aha. Single calling GET_DIRTY_LOG is useless since new dirty page can be generated > when GET_DIRTY_LOG is being returned. If user wants to get exact dirty pages the vcpus > should be stopped. > > > > > With fast page fault: > > > > if (cmpxchg64(sptep, spte, spte | PT_WRITABLE_MASK) == spte) > > /* The window in here... */ > > mark_page_dirty(vcpu->kvm, gfn); > > > > And the $SUBJECT set_spte reordering, the rule becomes > > > > A call to GET_DIRTY_LOG guarantees to return correct information about > > dirty pages before invocation of the previous GET_DIRTY_LOG call. > > > > (see example 1: the next GET_DIRTY_LOG will return the dirty information > > there). > > > > It seems no. > > The first GET_DIRTY_LOG can happen before fast-page-fault, > the second GET_DIRTY_LOG happens in the window between cmpxchg() > and mark_page_dirty(), for the second one, the information is still “incorrect”. Its correct for the previous GET_DIRTY_LOG call. > > The rule for sptes that is, because kvm_write_guest does not match the > > documentation at all. > > You mean the case of “kvm_write_guest” is valid (I do not know why it is)? > Or anything else? > > > > > So before example 1 and this patch, the rule (well for sptes at least) was > > > > "Given a memory slot, return a bitmap containing any pages dirtied > > since the last call to this ioctl. Bit 0 is the first page in the > > memory slot. Ensure the entire structure is cleared to avoid padding > > issues." > > > > Can you explain why it is OK to relax this rule? > > It’s because: > 1) it doesn’t break current use cases, i.e. Live migration and FB-flushing. > 2) the current code, like kvm_write_guest has already broken the documentation > (the guest page has been written but missed in the dirty bitmap). > 3) it’s needless to implement a exact get-dirty-pages since the dirty pages can > no be exactly got except stopping vcpus. > > So i think we'd document this case instead. No? Lets figure out the requirements, then. I don't understand why FB-flushing is safe (think kvm-autotest: one pixel off the entire test fails). Before fast page fault: Pages are either write protected or the corresponding dirty bitmap bit is set. Any write faults to dirty logged sptes while GET_DIRTY log is executing in the protected section are allowed to instantiate writeable spte after GET_DIRTY log is finished. After fast page fault: Pages can be writeable and the dirty bitmap not set. Therefore data can be dirty before GET_DIRTY executes and still fail to be returned in the bitmap. Since this patchset does not introduce change in behaviour (fast pf did), no reason to avoid merging this. BTW, since GET_DIRTY log does not require to report concurrent (to GET_DIRTY_LOG) write faults, it is not necessary to rewalk the spte list, is it? -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html