On 10/29/2024 2:45 PM, Xiaoyao Li wrote:
> On 10/29/2024 11:56 AM, Nikunj A. Dadhania wrote:
>>
>>
>> On 10/29/2024 8:32 AM, Xiaoyao Li wrote:
>>> On 10/28/2024 1:34 PM, Nikunj A Dadhania wrote:
>>>> Calibrating the TSC frequency using the kvmclock is not correct for
>>>> SecureTSC enabled guests. Use the platform provided TSC frequency via the
>>>> GUEST_TSC_FREQ MSR (C001_0134h).
>>>>
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
>>>> ---
>>>>  arch/x86/include/asm/sev.h |  2 ++
>>>>  arch/x86/coco/sev/core.c   | 16 ++++++++++++++++
>>>>  arch/x86/kernel/tsc.c      |  5 +++++
>>>>  3 files changed, 23 insertions(+)
>>>>
>>>> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
>>>> index d27c4e0f9f57..9ee63ddd0d90 100644
>>>> --- a/arch/x86/include/asm/sev.h
>>>> +++ b/arch/x86/include/asm/sev.h
>>>> @@ -536,6 +536,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>>>  }
>>>>  void __init snp_secure_tsc_prepare(void);
>>>> +void __init snp_secure_tsc_init(void);
>>>>  #else /* !CONFIG_AMD_MEM_ENCRYPT */
>>>> @@ -584,6 +585,7 @@ static inline int handle_guest_request(struct snp_msg_desc *mdesc, u64 exit_code
>>>>  				       u32 resp_sz) { return -ENODEV; }
>>>>  static inline void __init snp_secure_tsc_prepare(void) { }
>>>> +static inline void __init snp_secure_tsc_init(void) { }
>>>>  #endif /* CONFIG_AMD_MEM_ENCRYPT */
>>>> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
>>>> index 140759fafe0c..0be9496b8dea 100644
>>>> --- a/arch/x86/coco/sev/core.c
>>>> +++ b/arch/x86/coco/sev/core.c
>>>> @@ -3064,3 +3064,19 @@ void __init snp_secure_tsc_prepare(void)
>>>>  	pr_debug("SecureTSC enabled");
>>>>  }
>>>> +
>>>> +static unsigned long securetsc_get_tsc_khz(void)
>>>> +{
>>>> +	unsigned long long tsc_freq_mhz;
>>>> +
>>>> +	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
>>>> +	rdmsrl(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
>>>> +
>>>> +	return (unsigned long)(tsc_freq_mhz * 1000);
>>>> +}
>>>> +
>>>> +void __init snp_secure_tsc_init(void)
>>>> +{
>>>> +	x86_platform.calibrate_cpu = securetsc_get_tsc_khz;
>>>> +	x86_platform.calibrate_tsc = securetsc_get_tsc_khz;
>>>> +}
>>>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>>>> index dfe6847fd99e..730cbbd4554e 100644
>>>> --- a/arch/x86/kernel/tsc.c
>>>> +++ b/arch/x86/kernel/tsc.c
>>>> @@ -30,6 +30,7 @@
>>>>  #include <asm/i8259.h>
>>>>  #include <asm/topology.h>
>>>>  #include <asm/uv/uv.h>
>>>> +#include <asm/sev.h>
>>>>  unsigned int __read_mostly cpu_khz; /* TSC clocks / usec, not used here */
>>>>  EXPORT_SYMBOL(cpu_khz);
>>>> @@ -1514,6 +1515,10 @@ void __init tsc_early_init(void)
>>>>  	/* Don't change UV TSC multi-chassis synchronization */
>>>>  	if (is_early_uv_system())
>>>>  		return;
>>>> +
>>>> +	if (cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
>>>> +		snp_secure_tsc_init();
>>>
>>> IMHO, this isn't a good place to call snp_secure_tsc_init() to update the
>>> callbacks.
>>>
>>> It would be better to call it from one of the SNP init functions.
>>
>> As part of setup_arch(), init_hypervisor_platform() gets called and all the
>> PV clocks are registered and initialized as part of the init_platform
>> callback. Once the hypervisor platform is initialized, tsc_early_init() is
>> called. An SEV-SNP guest can be running on any hypervisor, so the callbacks
>> need to be updated either in tsc_early_init() or in
>> init_hypervisor_platform(). As the change is TSC related, I have updated
>> them here.
>
> I think it might be due to
>
> 1. it lacks a central place for SNP related stuff, like tdx_early_init()

sme_early_init() does the init for SEV/SNP related stuff, but it is not the
right place to do the TSC callback inits, as kvmclock will override them.

> 2. even if we have some place like 1), the callbacks will be overwritten in
>    init_hypervisor_platform() by the specific PV ops.
>
> However, I don't think it's good practice to update it in tsc.c. The reason
> why a callback is used is that arch/hypervisor specific code can implement
> and overwrite it with its own implementation in its own file.
>
> Back to your case, I think a central snp init function would be helpful,
> and we can introduce a new flag to skip the overwrite of tsc/cpu
> calibration for the hypervisor when the flag is set.

That again touches all the hypervisors (KVM, Xen, Hyper-V and VMware). We
wanted to move this to common code, as suggested by Sean.

Regards,
Nikunj
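
To make the ordering argument concrete, here is a minimal userspace sketch
of it. This is only a model under stated assumptions, not kernel code: every
model_* name and both frequency values are made up for illustration. It shows
a PV clock installing its calibration callbacks first (as
init_hypervisor_platform() would), and a later SecureTSC override, done where
tsc_early_init() runs, being the one that sticks. The MHz to kHz conversion
mirrors securetsc_get_tsc_khz() in the patch.

```c
/* Userspace model only; none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two calibration hooks in x86_platform. */
struct platform_ops_model {
	unsigned long (*calibrate_cpu)(void);
	unsigned long (*calibrate_tsc)(void);
};

static struct platform_ops_model platform_model;

/* Made-up value standing in for a GUEST_TSC_FREQ MSR read, in MHz. */
#define MODEL_GUEST_TSC_FREQ_MHZ 2500ULL

/* Made-up value standing in for a PV clock calibration result, in kHz. */
static unsigned long model_pv_clock_khz(void)
{
	return 1000000UL;
}

/* Same MHz -> kHz conversion as securetsc_get_tsc_khz() in the patch. */
static unsigned long model_securetsc_khz(void)
{
	return (unsigned long)(MODEL_GUEST_TSC_FREQ_MHZ * 1000);
}

/* Step 1: the hypervisor's init_platform hook installs its PV clock. */
static void model_init_hypervisor_platform(void)
{
	platform_model.calibrate_cpu = model_pv_clock_khz;
	platform_model.calibrate_tsc = model_pv_clock_khz;
}

/* Step 2: runs afterwards, so its override replaces the PV callbacks. */
static void model_tsc_early_init(bool secure_tsc)
{
	if (secure_tsc) {
		platform_model.calibrate_cpu = model_securetsc_khz;
		platform_model.calibrate_tsc = model_securetsc_khz;
	}
}

/* Runs both steps in boot order and returns the resulting TSC kHz. */
unsigned long model_boot_tsc_khz(bool secure_tsc)
{
	model_init_hypervisor_platform();
	model_tsc_early_init(secure_tsc);
	assert(platform_model.calibrate_tsc != NULL);
	return platform_model.calibrate_tsc();
}
```

In this model, calling model_boot_tsc_khz(true) yields the SecureTSC value
and model_boot_tsc_khz(false) yields the PV clock value. Swapping the two
steps inside model_boot_tsc_khz() would let the PV clock win in both cases,
which is the override concern with doing this from an earlier SNP init
function.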