On Wed, Jan 04, 2017 at 08:26:27PM -0200, Marcelo Tosatti wrote: > On Wed, Jan 04, 2017 at 05:59:17PM -0200, Eduardo Habkost wrote: > > On Wed, Jan 04, 2017 at 11:39:16AM -0200, Eduardo Habkost wrote: > > > On Wed, Jan 04, 2017 at 09:56:56AM -0200, Marcelo Tosatti wrote: > > > > On Tue, Dec 27, 2016 at 05:21:20PM -0200, Eduardo Habkost wrote: > > > > > Instead of blocking migration on the source when invtsc is > > > > > enabled, rely on the migration destination to ensure there's no > > > > > TSC frequency mismatch. > > > > > > > > > > We can't allow migration unconditionally because we don't know if > > > > > the destination is a QEMU version that is really going to ensure > > > > > there's no TSC frequency mismatch. To ensure we are migrating to > > > > > a destination that won't ignore SET_TSC_KHZ errors, allow invtsc > > > > > migration only on pc-*-2.9 and newer. > > > > > > > > > > Signed-off-by: Eduardo Habkost <ehabkost@xxxxxxxxxx> > > [...] > > > > > @@ -2655,12 +2656,14 @@ int kvm_arch_put_registers(CPUState *cpu, int level) > > > > > } > > > > > > > > > > if (level == KVM_PUT_FULL_STATE) { > > > > > - /* We don't check for kvm_arch_set_tsc_khz() errors here, > > > > > - * because TSC frequency mismatch shouldn't abort migration, > > > > > - * unless the user explicitly asked for a more strict TSC > > > > > - * setting (e.g. using an explicit "tsc-freq" option). > > > > > + /* Migration TSC frequency mismatch is fatal only if we are > > > > > + * actually reporting Invariant TSC to the guest. > > > > > */ > > > > > - kvm_arch_set_tsc_khz(cpu); > > > > > + ret = kvm_arch_set_tsc_khz(cpu); > > > > > + if ((x86_cpu->env.features[FEAT_8000_0007_EDX] & CPUID_APM_INVTSC) && > > > > > + ret < 0) { > > > > > + return ret; > > > > > + } > > > > > } > > > > > > > > Will the guest continue in the source in this case? > > > > > > > > I think this is past the point where migration has been declared > > > > successful. > > > > > > > > Otherwise looks good. > > > > > > Good point. I will make additional tests and see if there's some > > > other place where the kvm_arch_set_tsc_khz() call can be moved > > > to. > > > > So, if we solve this and do something on (for example) post_load, > > we still have a problem: device state is migrated after RAM. This > > means QEMU will check for TSC scaling and abort migration very > > late. > > > > We could solve that by manually registering a SaveVMHandler that > > will send the TSC frequency on save_live_setup, so migration is > > aborted earlier. > > > > But: this sounds like just a complex hack to work around the real > > problems: > > > > 1) TSC frequency is guest-visible, and anything that affects > > guest ABI should depend on the VM configuration, not on host > > capabilities; > > Well not really: the TSC frequency where the guest starts > is the frequency the guest software expects. > So it does depend on host capabilities. Could you explain where this expectation comes from? I can see multiple cases where choosing the TSC frequency where the VM starts is not the best option. I am considering two possible scenarios below. You probably have another scenario in mind, so it would be useful if you could describe it so we can consider possible solutions. Scenario 1: You have two hosts: A and B, both with TSC scaling. They have different frequencies. The VM can be started on either one of them, and can be migrated to either one depending on the policy of management software. Maybe B is a backup host, the VM is expected to run most times on host A, and we want to use the TSC frequency from host A every time. Maybe it's the opposite and we want to use the frequency of B. Maybe we want to use the lowest frequency of both, maybe the highest one. We (QEMU developers) can recommend policy to libvirt developers, but a given QEMU process doesn't have information to decide what's best. Scenario 2: Host A has TSC scaling, host B doesn't have TSC scaling. We want to be able to start the VM on host A, and migrate to B. In this case, the only possible solution is to use B's frequency when starting the VM. The QEMU process doesn't have enough information to make that decision. > > > 2) Setting TSC frequency depending on the host will make > > migratability unpredictable for management software: the same > > VM config could be migratable to host A when started on host > > B, but not migratable to host A when started on host C. > > Well, just check the frequency. What do you mean by "just check the frequency"? What exactly should management software do? Whatever management software do, if you just use the source host frequency you can get into a situation where you can start a VM on host A and migrate it to B, but can't start the VM on host B and migrate it to A. This would be confusing for users, and probably break assumptions of existing management software. > > > I suggest we allow migration with invtsc if and only if > > tsc-frequency is set explicitly by management software. In other > > words, apply only patches 1/4 and 2/4 from this series. After > > that, we will need libvirt support for configuring tsc-frequency. > > I don't like that for the following reasons: > > * It moves low level complexity from QEMU/KVM to libvirt > (libvirt has to call KVM_GET_TSC_KHZ from a vcpu thread). It doesn't. It could ask QEMU for that information (the new query-cpu-model-expansion QMP command would allow that). Or, alternatively, it could just let the user choose the frequency. It's probably acceptable for many use cases where invtsc+migration is important. > * It makes it difficult to run QEMU manually (i use QEMU > manually all the time). I believe the set of people that: 1) need invtsc; 2) need migration to work; and 3) run QEMU manually will be able to add "tsc-frequency=XXX" to the command-line easily. :) > * It requires patching libvirt. > > In my experience things work better when the functionality is > not split across libvirt/qemu. In my experience things break when management software is the only component able to make a decision but we don't provide mechanisms to let management software make that decision. The TSC frequency affects migratability to hosts. Choose the wrong frequency, and you might not be able to migrate. Only management software knows to which hosts the VM could be migrated in the future, and which TSC frequency would allow that. > > Can't this be fixed in QEMU? Just check that destination host supports > TSC scaling before migration (or that KVM_GET_TSC_KHZ return value > matches on source and destination). > This solves only one use case: where you want to expose the starting host TSC frequency to the guest. It doesn't cover any scenario where the TSC frequency of the starting host isn't the best one. (See the two scenarios I described above) You seem to have a specific use case in mind. If you describe it, we could decide if it's worth the extra migration code complexity to deal with invtsc migration without explicit tsc-frequency. By now, I am not convinced yet. Maybe your use case just needs a explicit "invtsc-migration=on" command-line flag without a mechanism to abort migration on mismatch? I can't tell. Note that even if we follow your suggestion and implement an alternative version of patch 4/4 to cover your use case, I will strongly recommend libvirt developers to support configuring TSC frequency if they want to support invtsc + migration without surprising/unpredictable restrictions on migratability. -- Eduardo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html