On 09/11/2016 17:28, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@xxxxxxxxxx) wrote:
>>
>>
>> On 08/11/2016 11:22, Dr. David Alan Gilbert wrote:
>>> * Marcelo Tosatti (mtosatti@xxxxxxxxxx) wrote:
>>>> On Mon, Nov 07, 2016 at 08:03:50PM +0000, Dr. David Alan Gilbert wrote:
>>>>> * Marcelo Tosatti (mtosatti@xxxxxxxxxx) wrote:
>>>>>> On Mon, Nov 07, 2016 at 03:46:11PM +0000, Dr. David Alan Gilbert wrote:
>>>>>>> * Marcelo Tosatti (mtosatti@xxxxxxxxxx) wrote:
>>>>>>>> This patch, relative to pre-copy migration codepath,
>>>>>>>> measures the time between vm_stop() and pre_save(),
>>>>>>>> which includes copying the remaining RAM to destination,
>>>>>>>> and advances the clock by that amount.
>>>>>>>>
>>>>>>>> In a VM with 5 seconds downtime, this reduces the guest
>>>>>>>> clock difference on destination from 5s to 0.2s.
>>>>>>>>
>>>>>>>> Tested with Linux and Windows 2012 R2 guests with -cpu XXX,+hv-time.
>>>>>>>
>>>>>>> One thing that bothers me is that it's only this clock that's
>>>>>>> getting corrected; doesn't it cause things to get upset when
>>>>>>> one clock moves and the others don't?
>>>>>>
>>>>>> If you are correlating the clocks, then yes.
>>>>>>
>>>>>> Older Linux guests get upset (marking the TSC clocksource unstable
>>>>>> because the watchdog checks TSC vs kvmclock), but there is a workaround for it
>>>>>> in newer guests
>>>>>> (kvmclock interface to notify watchdog to not complain).
>>>>>>
>>>>>> Note marking TSC clocksource unstable on older guests is harmless
>>>>>> because kvmclock is the standard clocksource.
>>>>>>
>>>>>> For Windows guests, I don't know that Windows correlates between different
>>>>>> clocks.
>>>>>>
>>>>>> That is, there is relative control as to which software reads kvmclock
>>>>>> or Windows TIMER MSR, so I don't see the need to advance every clock
>>>>>> exposed.
>>>>>>
>>>>>>> Shouldn't the pause delay be recorded somewhere architecturally
>>>>>>> independent and then be a thing that kvm-clock happens to use and
>>>>>>> other clocks might as well?
>>>>>>
>>>>>> In theory, yes. In practice, I don't see the need for this...
>>>>>
>>>>> It seems unlikely to me that x86 is the only one that will want
>>>>> to do something similar.
>>>>
>>>> Can't they copy what kvmclock is doing today?
>>>
>>> We shouldn't have copies of code all over should we?
>>
>> Let's cross the bridge when we get there.
>
> That will mean it has the migration data in the wrong place
> and any other clocks that need to be incremented by the same offset
> will need a hook or be inconsistent with this calculation.

No, there is no additional migration data that is needed.  This is just
a bug in how the pausing of CLOCK_MONOTONIC was implemented for the
kvmclock clocksource.

Right now, x86 is the only case where we have the problem, and x86 is
using a single "backend" for both kvmclock and the Hyper-V TSC reference
page.  For everyone else, there is no clocksource paravirtualization
going on (luckily, considering what a mess kvmclock is).  They can just
use QEMU_CLOCK_VIRTUAL if they want something that pauses while the VM
is stopped.

Now, QEMU_CLOCK_VIRTUAL actually has the same bug that Marcelo is
fixing, so we may indeed want a common solution if possible.  But again,
let's see first what the code looks like for _one_ clocksource, before
writing a generalized (and thus more complex) solution.

Paolo
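
For readers following the thread, the adjustment described in the quoted patch summary (measure the time between vm_stop() and pre_save(), then advance the clock by that amount) can be modeled roughly as below. This is a toy sketch in Python, not QEMU code: the class PausedGuestClock, its method names, and the injected host_now_ns callable are all hypothetical and exist only to show the arithmetic.

```python
class PausedGuestClock:
    """Toy model (hypothetical, not QEMU code) of a guest clock that is
    advanced by the downtime measured between VM stop and state save."""

    def __init__(self, guest_ns, host_now_ns):
        self.guest_ns = guest_ns            # guest-visible clock value, in ns
        self._host_now_ns = host_now_ns     # callable returning host monotonic ns
        self._stopped_at = None             # host time recorded when the VM stopped

    def on_vm_stop(self):
        # Remember when the guest stopped running.
        self._stopped_at = self._host_now_ns()

    def on_pre_save(self):
        # Advance the guest clock by the time spent stopped so far
        # (for example, while the remaining RAM was being copied).
        if self._stopped_at is None:
            return 0
        downtime = self._host_now_ns() - self._stopped_at
        self.guest_ns += downtime
        self._stopped_at = None
        return downtime


# Usage with a fake host clock: stopped at 1s, saved at 6s.
ticks = iter([1_000_000_000, 6_000_000_000])
clock = PausedGuestClock(guest_ns=42, host_now_ns=lambda: next(ticks))
clock.on_vm_stop()
downtime = clock.on_pre_save()
```

The model only accounts for the interval up to the save hook. Any time that passes after that point is not covered, which is consistent with the quoted result of a reduced but nonzero clock difference on the destination.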