On 14/03/22 13:50, Adrian Hunter wrote:
> On 08/03/2022 23:06, Hall, Christopher S wrote:
>> Adrian Hunter wrote:
>>> On 7.3.2022 16.42, Peter Zijlstra wrote:
>>>> On Mon, Mar 07, 2022 at 02:36:03PM +0200, Adrian Hunter wrote:
>>>>
>>>>>> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
>>>>>> index 4420499f7bb4..a1f179ed39bf 100644
>>>>>> --- a/arch/x86/kernel/paravirt.c
>>>>>> +++ b/arch/x86/kernel/paravirt.c
>>>>>> @@ -145,6 +145,15 @@ DEFINE_STATIC_CALL(pv_sched_clock, native_sched_clock);
>>>>>>
>>>>>> void paravirt_set_sched_clock(u64 (*func)(void))
>>>>>> {
>>>>>> + /*
>>>>>> + * Anything with ART on promises to have sane TSC, otherwise the whole
>>>>>> + * ART thing is useless. In order to make ART useful for guests, we
>>>>>> + * should continue to use the TSC. As such, ignore any paravirt
>>>>>> + * muckery.
>>>>>> + */
>>>>>> + if (cpu_feature_enabled(X86_FEATURE_ART))
>>>>>
>>>>> Does not seem to work because the feature X86_FEATURE_ART does not seem to get set.
>>>>> Possibly because detect_art() excludes anything running on a hypervisor.
>>>>
>>>> Simple enough to delete that clause I suppose. Christopher, what is
>>>> needed to make that go away? I suppose the guest needs to be aware of
>>>> the active TSC scaling parameters to make it work ?
>>>
>>> There is also not X86_FEATURE_NONSTOP_TSC nor values for art_to_tsc_denominator
>>> or art_to_tsc_numerator. Also, from the VM's point of view, TSC will jump
>>> forwards every VM-Exit / VM-Entry unless the hypervisor changes the offset
>>> every VM-Entry, which KVM does not, so it still cannot be used as a stable
>>> clocksource.
>>
>> Translating between ART and the guest TSC can be a difficult problem and ART software
>> support is disabled by default in a VM.
>>
>> There are two major issues translating ART to TSC in a VM:
>>
>> The range of the TSC scaling field in the VMCS is much larger than the range of values
>> that can be represented using CPUID[15H], i.e., it is not possible to communicate this
>> to the VM using the current CPUID interface. The range of scaling would need to be
>> restricted or another para-virtualized method - preferably OS/hypervisor agnostic - to
>> communicate the scaling factor to the guest needs to be invented.
>>
>> TSC offsetting may also be a problem. The VMCS TSC offset must be discoverable by the
>> guest. This can be done via TSC_ADJUST MSR. The offset in the VMCS and the guest
>> TSC_ADJUST MSR must always be equivalent, i.e. a write to TSC_ADJUST in the guest
>> must be reflected in the VMCS and any changes to the offset in the VMCS must be
>> reflected in the TSC_ADJUST MSR. Otherwise a para-virtualized method must
>> be invented to communicate an arbitrary VMCS TSC offset to the guest.
>>
>
> In my view it is reasonable for perf to support TSC as a perf clock in any case
> because:
> a) it allows users to work entirely with TSC if they wish
> b) other kernel performance / debug facilities like ftrace already support TSC
> c) the patches to add TSC support are relatively small and straight-forward
>
> May we have support for TSC as a perf event clock?

Any update on this?
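
For reference, a minimal sketch of the ART-to-TSC relation that the scaling and offset discussion above is about. It assumes the ratio comes from CPUID[15H] (denominator in EAX, numerator in EBX) and that a single additive offset applies. The function name and parameters are illustrative only and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: tsc = art * num / den + offset.
 *
 * num and den stand for the CPUID[15H] ratio (EBX and EAX); offset stands
 * for whatever additive TSC offset applies. The names are hypothetical.
 *
 * The quotient and remainder of art / den are scaled separately so that
 * the intermediate product does not overflow 64 bits as early as a plain
 * (art * num) would.
 */
static uint64_t art_to_tsc_sketch(uint64_t art, uint32_t num, uint32_t den,
				  uint64_t offset)
{
	uint64_t quot, rem;

	assert(den != 0);	/* a zero denominator means no usable ratio */

	quot = art / den;
	rem = art % den;

	return quot * num + (rem * num) / den + offset;
}
```

With a ratio of 188/2, for example, art_to_tsc_sketch(1000, 188, 2, 0) gives 94000. In a guest, both the ratio and the offset would additionally have to account for the VMCS TSC scaling and offset, which is exactly the information the guest cannot currently discover.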