On Fri, Jan 31, 2025 at 06:32:04PM -0800, Vishal Annapurve wrote: > On Fri, Jan 31, 2025 at 12:13 AM Kirill A. Shutemov > <kirill@xxxxxxxxxxxxx> wrote: > > > > On Thu, Jan 30, 2025 at 11:45:01AM -0800, Vishal Annapurve wrote: > > > On Thu, Jan 30, 2025 at 10:48 AM Kirill A. Shutemov > > > <kirill@xxxxxxxxxxxxx> wrote: > > > > ... > > > > > > > > > > > > I think it is worth to putting this into a separate patch and not > > > > > > backport. The rest of the patch is bugfix and this doesn't belong. > > > > > > > > > > > > Otherwise, looks good to me: > > > > > > > > > > > > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>@linux.intel.com> > > > > > > > > > > > > -- > > > > > > Kiryl Shutsemau / Kirill A. Shutemov > > > > > > > > > > Thanks Kirill for the review. > > > > > > > > > > Thinking more about this fix, now I am wondering why the efforts [1] > > > > > to move halt/safe_halt under CONFIG_PARAVIRT were abandoned. Currently > > > > > proposed fix is incomplete as it would not handle scenarios where > > > > > CONFIG_PARAVIRT_XXL is disabled. I am tilting towards reviving [1] and > > > > > requiring CONFIG_PARAVIRT for TDX VMs. WDYT? > > > > > > > > > > [1] https://lore.kernel.org/lkml/20210517235008.257241-1-sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx/ > > > > > > > > Many people dislike paravirt callbacks. We tried to avoid relying on them > > > > for core TDX enabling. > > > > > > > > Can you explain the issue you see with CONFIG_PARAVIRT_XXL being disabled? > > > > I don't think I follow. > > > > > > Relevant callers of *_safe_halt() are: > > > 1) kvm_wait() -> safe_halt() -> raw_safe_halt() -> arch_safe_halt() > > > > Okay, I didn't realized that CONFIG_PARAVIRT_SPINLOCKS doesn't depend on > > CONFIG_PARAVIRT_XXL. > > > > It would be interesting to check if paravirtualized spinlocks make sense > > for TDX given the cost of TD exit. > > > > Maybe we should avoid advertising KVM_FEATURE_PV_UNHALT to the TDX guests? > > > > Are you hinting towards a model where TDX guest prohibits such call > sites from being configured? I am not sure if it's a sustainable model > if we just rely on the host not advertising these features as the > guest kernel can still add new paths that are not controlled by the > host that lead to *_safe_halt(). I've asked TDX module folks to provide additional information in ve_info to help handle STI shadow correctly. They will implement it, but it will take some time. So we need some kind of stopgap until we have it. I am reluctant to commit to paravirt calls for this workaround. They will likely stick forever. It is possible, I would like to avoid them. If not, oh well. > > > 2) acpi_safe_halt() -> safe_halt() -> raw_safe_halt() -> arch_safe_halt() > > > > Have you checked why you get there? I don't see a reason for TDX guest to > > get into ACPI idle stuff. We don't have C-states to manage. > > Apparently userspace VMM is advertising pblock_address through SSDT > tables in my configuration which causes guests to enable ACPI cpuidle > drivers. Do you know if future generations of TDX hardware will not > support different c-states for TDX VMs? I have very limited understanding of power management, but I don't see how C-states can be meaningfully supported by any virtualized environment. To me, C-states only make sense for baremetal. -- Kiryl Shutsemau / Kirill A. Shutemov