On Fri, Feb 25, 2022, Woodhouse, David wrote:
> On Wed, 2022-02-23 at 16:53 +0000, Sean Christopherson wrote:
> > Don't actually set a request bit in vcpu->requests when making a request
> > purely to force a vCPU to exit the guest.  Logging a request but not
> > actually consuming it would cause the vCPU to get stuck in an infinite
> > loop during KVM_RUN because KVM would see the pending request and bail
> > from VM-Enter to service the request.
>
> Hm, it might be that we *do* want to do some work.
>
> I think there's a problem with the existing kvm_host_map that we
> haven't yet resolved with the new gfn_to_pfn_cache.
>
> Look for the calls to 'kvm_vcpu_unmap(…, true)' in e.g. vmx/nested.c
>
> Now, what if a vCPU is in guest mode, doesn't vmexit back to the L1,
> and its userspace thread takes a signal and returns to userspace?
>
> The pages referenced by those maps may have been written, but because
> the cache is still valid, they haven't been marked as dirty in the KVM
> dirty logs yet.
>
> So, a traditional live migration workflow, once it reaches convergence,
> would pause the vCPUs, copy the final batch of dirty pages to the
> destination, then destroy the VM on the source.
>
> And AFAICT those mapped pages don't actually get marked dirty until
> nested_vmx_free_vcpu() calls vmx_leave_nested().  Which will probably
> trigger the dirty log WARN now, since there's no active vCPU context
> for logging, right?
>
> And the latest copy of those pages never does get copied to the
> destination.
>
> Since I didn't spot that problem until today, the gfn_to_pfn_cache
> design inherited it too.  The 'dirty' flag remains set in the GPC until
> a subsequent revalidate or explicit unmap.
>
> Since we need an active vCPU context to do dirty logging (thanks, dirty
> ring)... and since any time vcpu_run exits to userspace for any reason
> might be the last time we ever get an active vCPU context... I think
> that kind of fundamentally means that we must flush dirty state to the
> log on *every* return to userspace, doesn't it?

I would rather add a variant of mark_page_dirty_in_slot() that takes a
vCPU, which we should have in all cases.  I see no reason to require use
of kvm_get_running_vcpu().
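
Something along these lines, maybe (rough sketch only; the helper name,
the exact signature, and the placement are illustrative assumptions, not
an actual patch — the body just mirrors what mark_page_dirty_in_slot()
already does, minus the kvm_get_running_vcpu() lookup):

	/*
	 * Illustrative variant of mark_page_dirty_in_slot() that takes the
	 * vCPU explicitly instead of grabbing it via kvm_get_running_vcpu().
	 * Would presumably sit next to mark_page_dirty_in_slot() in
	 * virt/kvm/kvm_main.c.
	 */
	void kvm_vcpu_mark_page_dirty_in_slot(struct kvm_vcpu *vcpu,
					      const struct kvm_memory_slot *memslot,
					      gfn_t gfn)
	{
		struct kvm *kvm = vcpu->kvm;
		unsigned long rel_gfn;
		u32 slot;

		if (!memslot || !kvm_slot_dirty_track_enabled(memslot))
			return;

		rel_gfn = gfn - memslot->base_gfn;

		if (kvm->dirty_ring_size) {
			slot = (memslot->as_id << 16) | memslot->id;
			/* Dirty ring entries go through the caller's vCPU. */
			kvm_dirty_ring_push(&vcpu->dirty_ring, slot, rel_gfn);
		} else {
			set_bit_le(rel_gfn, memslot->dirty_bitmap);
		}
	}

Callers that already have the vCPU in hand (e.g. the GPC and nested map
paths discussed above) could then log dirty pages without depending on
there being a "running" vCPU at that point.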