Kirill, On Fri, Dec 03 2021 at 01:21, Kirill A. Shutemov wrote: > On Thu, Nov 25, 2021 at 01:40:24AM +0100, Thomas Gleixner wrote: >> Kuppuswamy, >> Either that or you provide patches with arguments which are based on >> proper analysis and not on 'appears to' observations. > > I think the right solution to the WBINVD would be to add a #VE handler > that does nothing. We don't have a reasonable way to handle it from within > the guest. We can call the VMM in hope that it would handle it, but VMM is > untrusted and it can ignore the request. > > Dave suggested that we need to do code audit to make sure that there's no > user inside TDX guest environment that relies on WBINVD to work correctly. > > Below is full call tree of WBINVD. It is substantially larger than I > anticipated from initial grep. > > Conclusions: > > - Most of callers are in ACPI code on changing S-states. Ignoring cache > flush for S-state change on virtual machine should be safe. > > - The only WBINVD I was able to trigger is on poweroff from ACPI code. > Reboot also should trigger it, but for some reason I don't see it. > > - Few caller in CPU offline code. TDX does not allowed to offline CPU as > we cannot bring it back -- we don't have SIPI. And even if offline > works for vCPU it should be safe to ignore WBINVD there. > > - NVDIMMs are not supported inside TDX. If it will change we would need > to deal with cache flushing for this case. Hopefully, we would be able > to avoid WBINVD. > > - Cache QoS and MTRR use WBINVD. They are disabled in TDX, but it is > controlled by VMM if the feature is advertised. We would need to > filter CPUID/MSRs to make sure VMM would not mess with them. > > Is it good enough justification for do-nothing #VE WBINVD handler? first of all thank you very much for this very profound analysis. This is really what I was asking for and you probably went even a step deeper than that. Very appreciated. What we should do instead of doing a wholesale let's ignore WBINVD is to have a separate function/macro: ACPI_FLUSH_CPU_CACHE_PHYS() and invoke that from the functions which are considered to be safe. That would default to ACPI_FLUSH_CPU_CACHE() for other architecures obviously. Then you can rightfully do: #define ACPI_FLUSH_CPU_CACHE_PHYS() \ if (!cpu_feature_enabled(XXX)) \ wbinvd(); \ where $XXX might be FEATURE_TDX_GUEST for paranoia sake and then extended to X86_FEATURE_HYPERVISOR if everyone agrees. Then you have the #VE handler which just acts on any other wbinvd invocation via warn, panic, whatever, no? Thanks, tglx