On Fri, Sep 20, 2024, Nikunj A. Dadhania wrote:
> On 9/18/2024 5:37 PM, Sean Christopherson wrote:
> > On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
> >> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
> >>>> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
> >>>> Tested-by: Peter Gonda <pgonda@xxxxxxxxxx>
> >>>> ---
> >>>>  arch/x86/kernel/kvmclock.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> >>>> index 5b2c15214a6b..3d03b4c937b9 100644
> >>>> --- a/arch/x86/kernel/kvmclock.c
> >>>> +++ b/arch/x86/kernel/kvmclock.c
> >>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
> >>>>  {
> >>>>  	u8 flags;
> >>>>
> >>>> -	if (!kvm_para_available() || !kvmclock)
> >>>> +	if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
> >>>
> >>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way.  Unless
> >>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
> >>> is simply what's forcing the issue, but it's not actually the reason why Linux
> >>> should prefer the TSC over kvmclock.  The underlying reason is that platforms that
> >>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
> >>> TSC is a superior timesource purely from a functionality perspective.  That it's
> >>> more secure is icing on the cake.
> >>
> >> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
> >> should be disabled assuming that timesource is stable and always running?
> >
> > No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
> > is stable, irrespective of SNP or TDX.  This is effectively already done for the
> > timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> > invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
> > kvm_sched_clock_init() code.
>
> The kvm-clock and tsc-early both are having the rating of 299. As they are of
> same rating, kvm-clock is being picked up first.
>
> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
> be picked up instead.

IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.

But the kernel will still be using kvmclock for the scheduler clock, which is
undesirable.
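
To make the rating tie concrete, here is a small userspace sketch of rating-based selection. It is a simplified model, not kernel code: struct clock_list, clock_list_enqueue() and clock_list_best() are hypothetical names, and the tie-break rule (an entry with an equal rating is queued behind the ones already registered) is an assumption chosen to match the behavior described above, where kvm-clock wins at 299 vs. 299.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SOURCES 8

/* Hypothetical stand-in for a registered clocksource: name and rating only. */
struct clock_model {
	const char *name;
	int rating;
};

/* Hypothetical list, kept sorted by descending rating. */
struct clock_list {
	struct clock_model entries[MAX_SOURCES];
	size_t count;
};

/*
 * Insert a clock while keeping the list sorted by descending rating.
 * Assumption: a new entry goes after existing entries with an equal or
 * higher rating, so on a tie the earlier registration stays in front.
 * Returns 0 on success, -1 if the list is full.
 */
static int clock_list_enqueue(struct clock_list *list, const char *name, int rating)
{
	size_t pos = 0;
	size_t i;

	assert(list != NULL);
	if (list->count >= MAX_SOURCES)
		return -1;

	/* Skip every entry rated at least as high as the new one. */
	while (pos < list->count && list->entries[pos].rating >= rating)
		pos++;

	/* Shift lower-rated entries down by one slot to make room. */
	for (i = list->count; i > pos; i--)
		list->entries[i] = list->entries[i - 1];

	list->entries[pos].name = name;
	list->entries[pos].rating = rating;
	list->count++;
	return 0;
}

/* The selected clock is the head of the list, or NULL if the list is empty. */
static const char *clock_list_best(const struct clock_list *list)
{
	return list->count ? list->entries[0].name : NULL;
}
```

In this model, enqueueing "kvm-clock" at 299 and then "tsc-early" at 299 leaves kvm-clock at the head, while dropping kvm-clock to 298 puts tsc-early at the head. The model only covers clocksource selection; it says nothing about which clock backs the scheduler clock, which is the separate problem noted above.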