On Wed, 2023-06-07 at 22:43 +0000, Huang, Kai wrote:
> On Wed, 2023-06-07 at 07:15 -0700, Hansen, Dave wrote:
> > On 6/4/23 07:27, Kai Huang wrote:
> > > TDX memory has integrity and confidentiality protections.  Violations of
> > > this integrity protection are supposed to only affect TDX operations and
> > > are never supposed to affect the host kernel itself.  In other words,
> > > the host kernel should never, itself, see machine checks induced by the
> > > TDX integrity hardware.
> >
> > At the risk of patting myself on the back by acking a changelog that I
> > wrote 95% of:
> >
> > Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >
>
> Thanks!

Hi Dave,

Thanks for reviewing and providing the tag.  However, I found a bug with
using early_initcall() to detect the erratum here: in the later kexec()
patch, early_initcall(tdx_init) sets up the x86_platform.memory_shutdown()
callback to reset TDX private memory depending on the presence of the
erratum, but there is no guarantee that the erratum detection runs before
tdx_init(), because both are early_initcall()s.

Kirill also said early_initcall() isn't the right place, so I moved the
detection to an earlier phase, in bsp_init_intel().  We only need to match
the CPU once, on the BSP, assuming the CPU model is consistent across all
CPUs (which is the assumption of x86_match_cpu() anyway).

Please let me know if you have any comments.

+/*
+ * These CPUs have an erratum.  A partial write from non-TD
+ * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
+ * private memory poisons that memory, and a subsequent read of
+ * that memory triggers #MC.
+ */
+static const struct x86_cpu_id tdx_pw_mce_cpu_ids[] __initconst = {
+	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, NULL),
+	{ }
+};
+
 static void bsp_init_intel(struct cpuinfo_x86 *c)
 {
 	resctrl_cpu_detect(c);
+
+	if (x86_match_cpu(tdx_pw_mce_cpu_ids))
+		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
 }
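
To make the ordering problem concrete, below is a small user-space model of
it.  This is only a sketch, not kernel code: bug_tdx_pw_mce,
memory_shutdown, cpu_has_erratum, run_initcalls() and boot() are
hypothetical stand-ins for the bug bit, the x86_platform callback, the CPU
match and the boot flow.  It assumes that two initcalls at the same level
run in whatever order the build happens to place them, which is what the
callback setup must not depend on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the X86_BUG_TDX_PW_MCE bug bit. */
static bool bug_tdx_pw_mce;

/* Hypothetical stand-in for the x86_platform.memory_shutdown() callback. */
static void (*memory_shutdown)(void);

/* Pretend we are running on an affected CPU model. */
static bool cpu_has_erratum = true;

static void reset_tdx_private_memory(void)
{
	/* Placeholder: the real reset is out of scope for this model. */
}

/* Models the erratum detection: match the CPU once and latch the bug bit. */
static void detect_erratum(void)
{
	if (cpu_has_erratum)
		bug_tdx_pw_mce = true;
}

/* Models tdx_init(): reads the bug bit when installing the callback. */
static void tdx_init(void)
{
	memory_shutdown = bug_tdx_pw_mce ? reset_tdx_private_memory : NULL;
}

typedef void (*initcall_t)(void);

/* Runs same-level initcalls strictly in the order given. */
static void run_initcalls(const initcall_t *calls, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(calls[i] != NULL);
		calls[i]();
	}
}

/*
 * Models one boot and returns true if the callback ended up installed.
 * detect_in_bsp_init: detection runs before any initcall (the new scheme).
 * detect_linked_first: for the old scheme, which of the two initcalls
 * happens to run first.
 */
static bool boot(bool detect_in_bsp_init, bool detect_linked_first)
{
	const initcall_t detect_first[] = { detect_erratum, tdx_init };
	const initcall_t tdx_first[] = { tdx_init, detect_erratum };
	const initcall_t tdx_only[] = { tdx_init };

	bug_tdx_pw_mce = false;
	memory_shutdown = NULL;

	if (detect_in_bsp_init) {
		detect_erratum();
		run_initcalls(tdx_only, 1);
	} else if (detect_linked_first) {
		run_initcalls(detect_first, 2);
	} else {
		run_initcalls(tdx_first, 2);
	}

	return memory_shutdown != NULL;
}
```

In this model, boot(false, false) is the failing case: tdx_init() runs
before the bug bit is set, so the callback is never installed even though
the CPU is affected.  With detection done up front, boot(true, ...)
installs the callback regardless of initcall order.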