On Fri, Sep 21, 2012 at 11:30:31PM +0300, Dor Laor wrote:
> On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:
> >On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:
> >>On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:
> >>>
> >>>
> >>>HW TSC scaling is a feature of AMD processors that allows a
> >>>multiplier to be specified to the TSC frequency exposed to the guest.
> >>>
> >>>KVM also contains provision to trap TSC ("KVM: Infrastructure for
> >>>software and hardware based TSC rate scaling" cc578287e3224d0da)
> >>>or advance TSC frequency.
> >>>
> >>>This is useful when migrating to a host with different frequency and
> >>>the guest is possibly using direct RDTSC instructions for purposes
> >>>other than measuring cycles (that is, it previously calculated
> >>>cycles-per-second, and uses that information which is stale after
> >>>migration).
> >>>
> >>>"qemu-x86: Set tsc_khz in kvm when supported" (e7429073ed1a76518)
> >>>added support for tsc_khz= option in QEMU.
> >>>
> >>>I am proposing the following changes so that management applications
> >>>can work with this:
> >>>
> >>>1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
> >>>option). Host means that QEMU is responsible for retrieving the
> >>>TSC frequency of the host processor and use that.
> >>>Management application does not have to deal with the burden.
> >>>
> >>>2) New subsection with tsc_khz value. Destination host should consult
> >>>supported features of running kernel and fail if feature is unsupported.
> >>>
> >>>
> >>>It is not necessary to use this tsc_khz setting with modern guests
> >>>using paravirtual clocks, or when its known that applications make
> >>>proper use of the time interface provided by operating systems.
> >>>
> >>>On the other hand, legacy applications or setups which require no
> >>>modification and correct operation while virtualized and make
> >>>use of RDTSC might need this.
> >>>
> >>>Therefore it appears that this "tsc_khz=auto" option can be specified
> >>>only if the user specifies so (it can be a per-guest flag hidden
> >>>in the management configuration/manual).
> >>>
> >>>Sending this email to gather suggestions (or objections)
> >>>to this interface.
> >>
> >>I'm not sure I understand the exact difference between the offers.
> >>We can define these 3 options:
> >>
> >>1. Qemu/kvm won't make use of tsc scaling feature at all.
> >>2. tsc scaling is used and we take the value either from the host or
> >>   from the live migration data that overrides the later for incoming.
> >>   As you've said, it should be passed through a sub section.
> >>3. Manual setting of the value (uncommon).
> >>
> >>Is there another option worth considering?
> >>The questions is what should be the default. IMHO #2 is more
> >>appropriate to serve as a default since we do expect tsc to change
> >>between hosts.
> >
> >Option 1. is more appropriate to serve as a default given that
> >modern guests make use of paravirt, as you have observed.
>
> but you also observed that legacy applications that use rdtsc (even
> over pv kernel) will still be affected by the physical tsc
> frequency. Since I'm not aware of downside for using scaling, I
> rather pick opt #2 as a default.

The downside is that, if your destination host does not support tsc
scaling, two possibilities arise:

1) destination tsc frequency > source tsc frequency: TSC trap
2) destination tsc frequency < source tsc frequency: TSC catchup

TSC trapping is not wanted, because it is slow. This is the downside.

Note Intel does not support tsc scaling.

> >That is, tsc scaling is only required if the guest does direct RDTSC
> >on the expectation that the value won't change.
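
To make the two fallback cases above concrete, here is a minimal sketch
of the decision, written as plain Python rather than kernel code. The
names pick_tsc_mode and TscMode are hypothetical and exist only for this
illustration; they are not QEMU or KVM interfaces. The exact-equality
check is also a simplification, since a real implementation would
presumably allow some tolerance between the two frequencies.

```python
from enum import Enum


class TscMode(Enum):
    PASSTHROUGH = "passthrough"  # frequencies match, nothing to do
    HW_SCALING = "hw_scaling"    # hardware multiplier available on the host
    TRAP = "trap"                # intercept RDTSC (slow)
    CATCHUP = "catchup"          # advance the guest TSC to keep up


def pick_tsc_mode(source_khz: int, dest_khz: int,
                  dest_has_hw_scaling: bool) -> TscMode:
    """Hypothetical helper: how would the destination keep the guest TSC
    running at source_khz? Illustration only, not a QEMU/KVM function."""
    if source_khz <= 0 or dest_khz <= 0:
        raise ValueError("frequencies must be positive")
    if dest_khz == source_khz:
        # Assumption: exact match; no tolerance window is modelled here.
        return TscMode.PASSTHROUGH
    if dest_has_hw_scaling:
        return TscMode.HW_SCALING
    if dest_khz > source_khz:
        # Case 1: destination TSC is faster than the guest expects.
        return TscMode.TRAP
    # Case 2: destination TSC is slower than the guest expects.
    return TscMode.CATCHUP
```

For example, pick_tsc_mode(2000000, 2600000, False) lands in the trap
case, which is the slow path that makes option #2 unattractive as a
default on hosts without hardware scaling.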
> >
> >>Cheers,
> >>Dor

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html