On Mon, Oct 26, 2015 at 10:09:13AM +0800, haozhong.zhang@xxxxxxxxx wrote: > On Fri, Oct 23, 2015 at 12:45:13PM -0200, Eduardo Habkost wrote: > > On Fri, Oct 23, 2015 at 10:27:27AM +0800, Haozhong Zhang wrote: > > > On Thu, Oct 22, 2015 at 04:45:21PM -0200, Eduardo Habkost wrote: > > > > On Tue, Oct 20, 2015 at 03:22:51PM +0800, Haozhong Zhang wrote: > > > > > This patchset enables QEMU to save/restore vcpu's TSC rate during the > > > > > migration. When cooperating with KVM which supports TSC scaling, guest > > > > > programs can observe a consistent guest TSC rate even though they are > > > > > migrated among machines with different host TSC rates. > > > > > > > > > > A pair of cpu options 'save-tsc-freq' and 'load-tsc-freq' are added to > > > > > control the migration of vcpu's TSC rate. > > > > > > > > The requirements and goals aren't clear to me. I see two possible use > > > > cases, here: > > > > > > > > 1) Best effort to keep TSC frequency constant if possible (but not > > > > aborting migration if not possible). This would be an interesting > > > > default, but a bit unpredictable. > > > > 2) Strictly ensuring TSC frequency stays constant on migration (and > > > > aborting migration if not possible). This would be an useful feature, > > > > but can't be enabled by default unless both hosts have the same TSC > > > > frequency or support TSC scaling. > > > > > > > > Which one(s) you are trying to implement? > > > > > > > > > > The former. I agree that it's unpredictable if setting vcpu's TSC > > > frequency to the migrated value is enabled by default (but not in this > > > patchset). The cpu option 'load-tsc-freq' is introduced to allow users > > > to enable this behavior if they do know the underlying KVM and CPU > > > support TSC scaling. In this way, I think the behavior is predictable > > > as users do know what they are doing. > > > > I'm confused. If load-tsc-freq doesn't abort when TSC scaling isn't > > available (use case #1), why isn't it enabled by default? On the other > > hand, if you expect the user to enable it only if the host supports TSC > > scaling, why doesn't it abort if TSC scaling isn't available? > > > > Sorry for the confusion. For user case #1, load-tsc-freq is really not > needed and the migrated TSC frequency should be set if possible > (ie. if TSC scaling is supported and KVM_SET_TSC_KHZ succeeds). If > setting TSC frequency fails, the migration will not be aborted. Agreed. > > > I mean, we can implement both use cases above this way: > > > > 1) If the user didn't ask for anything explicitly: > > * If the tsc-freq value is available in the migration stream, try to > > set it (but don't abort if it can't be set). (use case #1 above) > > * Rationale: it won't hurt to try to make the VM behave nicely if > > possible, without blocking migration if TSC scaling isn't > > available. > > 2) If the user asked for the TSC frequency to be enforced, set it and > > abort if it couldn't be set (use case #2 above). This could apply to > > both cases: > > 2.1) If tsc-freq is explicitly set in the command-line. > > * Rationale: if the user asked for a specific frequency, we > > should do what was requested and not ignore errors silently. > > 2.2) If tsc-freq is available in the migration stream, and the > > user asked explicitly for it to be enforced. > > * Rationale: the user is telling us that the incoming tsc-freq > > is important, so we shouldn't ignore it silently. > > * Open question: how should we name the new option? > > "load-tsc-freq" would be misleading because it won't be just about > > _loading_ tsc-freq (we would be loading it on use case #1, too), > > but about making sure it is enforced. "strict-tsc-freq"? > > "enforce-tsc-freq"? > > > > We don't need to implement both #1 and #2 at the same time. But if you > > just want to implement #1 first, I don't see the need for the > > "load-tsc-freq" option. > > > > On the migration source, we need another option or internal machine flag > > for #1. I am not sure it should be an user-visible option. If > > user-visible, I don't know how to name it. "save-tsc-freq" describes it > > correctly, but it doesn't make its purpose very clear. Any suggestions? > > It can also be implemented first as an internal machine class flag (set > > in pc >= 2.5 only), and possibly become a user-visible option later. > > > > Because the way I implements 'save-tsc-freq' in patch 1, it's exposed > to users. I'm not familiar the way to make a feature only available > for newer machine types. Could you provide some suggestions to hide > 'save-tsc-freq' from users? You can make it an internal flag if you make it a PCMachineClass field, set it to true on pc_machine_class_init(), and set it to false on pc_*_2_4_machine_options(). I am unsure about the user-visible option. I am inclined towards making it internal first because once we make it user-visible there's no turning back. > > For the name, if we make the option internal only, could we still use > 'save-tsc-freq' as it does mean saving the TSC frequency. save_tsc_freq sounds perfect for an internal flag, yes. > > > > > > > > In other words, what is the right behavior when KVM_SET_TSC_KHZ fails or > > > > KVM_CAP_TSC_CONTROL is not available? We can't answer that question if > > > > the requirements and goals are not clear. > > > > > > > > > > If KVM_CAP_TSC_CONTROL is unavailable, QEMU and KVM will use the host > > > TSC frequency as vcpu's TSC frequency. > > > > > > If KVM_CAP_TSC_CONTROL is available and KVM_SET_TSC_KHZ fails, the > > > setting of TSC frequency will fail and abort either the VM creation > > > (this is the case for cpu option 'tsc-freq') or the migration. > > > > I don't see why the lack of KVM_CAP_TSC_CONTROL and failure of > > KVM_SET_TSC_KHZ should be treated differently. In both cases it means we > > the TSC frequency can't be set. > > > > I mean: if KVM_SET_TSC_KHZ is important enough for the user to make QEMU > > abort, it should abort if KVM_CAP_TSC_CONTROL isn't even available. On > > the other hand, if the user doesn't care about the lack of > > KVM_CAP_TSC_CONTROL (meaning it isn't possible to call KVM_SET_TSC_KHZ > > at all), I don't see why they would care if KVM_SET_TSC_KHZ failed. > > > > Yes, we should treat them equally. If either of KVM_CAP_TSC_CONTROL or > KVM_SET_TSC_KHZ fails, we should display a warning message (but not > fail the migration). Agreed. I mean, except when tsc-freq is explicitly set in the command-line, in which case I believe we should abort (but we can do that later, because that would be a change from the current behavior). > > > > > > > > > > Once we know what exactly is the goal, we could enable the new mode with > > > > a single option, instead of raw options to control migration stream > > > > loading/saving. > > > > > > > > > > Saving vcpu's TSC frequency does not depend on TSC scaling but the > > > loading does. And that is why I introduce two cpu options to control > > > them separately. > > > > I understand we probably need internal flags on the source and > > destination side to control saving/loading/setting the tsc-freq. But I > > am still trying to clarify what are the configuration semantics we need, > > exactly, to properly evaluate both the new configuration interface and > > its implementation. > > > > As my above replies, we only need an internal flag 'save-tsc-freq' to > indicate whether it needs to save the TSC frequency to the migration > stream. While on the migration destination side, we set to the > migrated TSC frequency only if both TSC scaling is supported and > KVM_SET_TSC_KHZ succeeds; in other cases, we do not set TSC frequency > and not abort the migration. Agreed. -- Eduardo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html