Re: [libvirt] TSC scaling interface to management

On 09/23/2012 04:06 AM, Marcelo Tosatti wrote:
On Fri, Sep 21, 2012 at 11:30:31PM +0300, Dor Laor wrote:
On 09/21/2012 05:51 AM, Marcelo Tosatti wrote:
On Fri, Sep 21, 2012 at 12:02:46AM +0300, Dor Laor wrote:
On 09/12/2012 06:39 PM, Marcelo Tosatti wrote:


HW TSC scaling is a feature of AMD processors that allows a
multiplier to be specified to the TSC frequency exposed to the guest.
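As a rough illustration of what such a multiplier does, here is a
minimal Python sketch. The function names are hypothetical, and an exact
fraction stands in for whatever fixed-point format the hardware actually
uses.

```python
from fractions import Fraction

def tsc_multiplier(guest_khz: int, host_khz: int) -> Fraction:
    """Ratio applied to host TSC ticks to obtain guest-visible ticks."""
    if guest_khz <= 0 or host_khz <= 0:
        raise ValueError("frequencies must be positive")
    return Fraction(guest_khz, host_khz)

def guest_tsc_delta(host_tsc_delta: int, guest_khz: int, host_khz: int) -> int:
    """Guest-visible ticks elapsed while the host TSC advanced by host_tsc_delta."""
    # int() truncates toward zero, so any fractional tick is dropped.
    return int(host_tsc_delta * tsc_multiplier(guest_khz, host_khz))
```

With a 3 GHz host and a guest expecting 2 GHz, the multiplier is 2/3, so
3,000,000 host ticks appear to the guest as 2,000,000 ticks.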

KVM also contains provisions to trap TSC accesses ("KVM: Infrastructure
for software and hardware based TSC rate scaling" cc578287e3224d0da)
or to advance the guest TSC (catchup).

This is useful when migrating to a host with a different TSC frequency
while the guest is possibly using direct RDTSC instructions for purposes
other than measuring cycles (that is, it previously calculated
cycles-per-second, and keeps using that value, which is stale after
migration).
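To make the stale-calibration problem concrete, here is a small worked
example in Python. The frequencies are made-up numbers and the function
name is hypothetical; it only shows the arithmetic a guest application
would get wrong.

```python
def stale_elapsed_seconds(real_seconds: int, calibrated_khz: int,
                          actual_khz: int) -> float:
    """Elapsed time as computed by a guest that still divides TSC ticks by
    the frequency it calibrated before migration (calibrated_khz), while
    the TSC now really ticks at actual_khz."""
    ticks = real_seconds * actual_khz * 1000   # ticks really counted
    return ticks / (calibrated_khz * 1000)     # converted with the stale value

# Calibrated on a 2 GHz source, now running on a 3 GHz destination:
# 10 real seconds are reported as 15 seconds.
print(stale_elapsed_seconds(10, 2_000_000, 3_000_000))  # 15.0
```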

"qemu-x86: Set tsc_khz in kvm when supported" (e7429073ed1a76518)
added support for tsc_khz= option in QEMU.

I am proposing the following changes so that management applications
can work with this:

1) New option for tsc_khz, which is tsc_khz=host (QEMU command line
option). Host means that QEMU is responsible for retrieving the
TSC frequency of the host processor and using it.
The management application does not have to deal with that burden.

2) New migration subsection carrying the tsc_khz value. The destination
host should consult the supported features of its running kernel and
fail if the feature is unsupported.
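The two changes above could be sketched roughly as follows. This is not
QEMU code: the function names, the way the host frequency is passed in,
and the boolean capability flag are all assumptions made for
illustration.

```python
from typing import Optional

def resolve_tsc_khz(option: Optional[str], host_tsc_khz: int) -> Optional[int]:
    """Turn a tsc_khz= option value into a frequency in kHz.

    None means the option was not given; "host" means "use the host
    frequency" (host_tsc_khz is assumed to have been read by the caller)."""
    if option is None:
        return None
    if option == "host":
        return host_tsc_khz
    value = int(option)
    if value <= 0:
        raise ValueError("tsc_khz must be positive")
    return value

def check_incoming_tsc_khz(migrated_khz: Optional[int],
                           dest_can_set_tsc_khz: bool) -> None:
    """Destination-side check: if the migration stream carries a tsc_khz
    subsection and the running kernel cannot honour it, fail."""
    if migrated_khz is not None and not dest_can_set_tsc_khz:
        raise RuntimeError("destination kernel cannot set guest tsc_khz")
```

A stream without the subsection (migrated_khz is None) passes the check
unchanged, so guests that never asked for the feature are unaffected.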


It is not necessary to use this tsc_khz setting with modern guests
using paravirtual clocks, or when it's known that applications make
proper use of the time interface provided by operating systems.

On the other hand, legacy applications or setups that must run
unmodified and operate correctly while virtualized, and that make
use of RDTSC, might need this.

Therefore it appears that this "tsc_khz=auto" option should be enabled
only when the user explicitly asks for it (it can be a per-guest flag
hidden in the management configuration/manual).

Sending this email to gather suggestions (or objections)
to this interface.

I'm not sure I understand the exact difference between the proposals.
We can define these 3 options:

1. QEMU/KVM won't make use of the TSC scaling feature at all.
2. TSC scaling is used and we take the value either from the host or
    from the live migration data, which overrides the host value on
    incoming migration.
    As you've said, it should be passed through a subsection.
3. Manual setting of the value (uncommon).
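The three options can be written down as a small selection function. The
enum and parameter names below are hypothetical and only mirror the list
above; this is a sketch, not an existing interface.

```python
from enum import Enum
from typing import Optional

class TscPolicy(Enum):
    OFF = 1      # option 1: scaling is not used at all
    AUTO = 2     # option 2: host value, overridden by migration data
    MANUAL = 3   # option 3: value set by hand

def effective_tsc_khz(policy: TscPolicy, host_khz: int,
                      migrated_khz: Optional[int] = None,
                      manual_khz: Optional[int] = None) -> Optional[int]:
    """Frequency the guest should see, or None when scaling is not used."""
    if policy is TscPolicy.OFF:
        return None
    if policy is TscPolicy.MANUAL:
        if manual_khz is None:
            raise ValueError("manual policy needs an explicit tsc_khz")
        return manual_khz
    # AUTO: on incoming migration the value from the stream wins.
    return migrated_khz if migrated_khz is not None else host_khz
```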

Is there another option worth considering?
The question is what the default should be. IMHO #2 is more
appropriate to serve as a default since we do expect tsc to change
between hosts.

Option 1. is more appropriate to serve as a default given that
modern guests make use of paravirt, as you have observed.

But you also observed that legacy applications that use rdtsc (even
on a pv kernel) will still be affected by the physical tsc
frequency. Since I'm not aware of a downside to using scaling, I'd
rather pick option #2 as a default.

The downside is that, if your destination host does not support tsc
scaling, two possibilities arise:

1) destination tsc frequency > source tsc frequency: TSC trap
2) destination tsc frequency < source tsc frequency: TSC catchup

TSC trapping is not wanted, because it is slow.
This is the downside.
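Put as a decision table, the cases look roughly like this. The function
and its return strings are hypothetical labels for the behaviours named
above, not KVM identifiers.

```python
def destination_behaviour(source_khz: int, dest_khz: int,
                          dest_has_hw_scaling: bool) -> str:
    """How the destination would present the source TSC frequency."""
    if dest_khz == source_khz:
        return "native"    # frequencies match, nothing to do
    if dest_has_hw_scaling:
        return "scale"     # hardware multiplier, no trapping needed
    if dest_khz > source_khz:
        return "trap"      # destination TSC too fast: trap RDTSC (slow)
    return "catchup"       # destination TSC too slow: catchup
```

For example, destination_behaviour(2_000_000, 3_000_000, False) returns
"trap", which is the slow case.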

Note Intel does not support tsc scaling.

TSC scaling should not happen by default on CPU models that don't
support it. As you mention, it's too costly to emulate. At least that
should be the default: TSC scaling on if the processor supports it,
and off otherwise.

Eventually, if the processor supports scaling we should enable it and
send the subsection on migration.

That is, tsc scaling is only required if the guest does direct RDTSC
on the expectation that the value won't change.

Cheers,
Dor
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list

