On Mon, May 29, 2023 at 11:31:29PM +0300, Kirill A. Shutemov wrote:
> On Mon, May 29, 2023 at 09:27:13PM +0200, Thomas Gleixner wrote:
> > On Mon, May 29 2023 at 05:39, Kirill A. Shutemov wrote:
> > > On Sat, May 27, 2023 at 03:40:02PM +0200, Thomas Gleixner wrote:
> > > But it gets broken again on "x86/smpboot: Implement a bit spinlock to
> > > protect the realmode stack" with
> > >
> > > [ 0.554079] .... node #0, CPUs: #1 #2
> > > [ 0.738071] Callback from call_rcu_tasks() invoked.
> > > [ 10.562065] CPU2 failed to report alive state
> > > [ 10.566337] #3
> > > [ 20.570066] CPU3 failed to report alive state
> > > [ 20.574268] #4
> > > ...
> > >
> > > Notably CPU1 is missing from "failed to report" list. So CPU1 takes the
> > > lock fine, but seems never unlocks it.
> > >
> > > Maybe trampoline_lock(%rip) in head_64.S somehow is not the same as
> > > &tr_lock in trampoline_64.S. I donno.
> >
> > It's definitely the same in the regular startup (16bit mode), but TDX
> > starts up via:
> >
> > trampoline_start64
> >   trampoline_compat
> >     LOAD_REALMODE_ESP <- lock
> >
> > That place cannot work with that LOAD_REALMODE_ESP macro. The untested
> > below should cure it.
>
> Yep, works for me.
>
> Aaand the next patch that breaks TDX boot is... <drum roll>
>
> 	x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it
>
> Disabling parallel bringup helps. I didn't look closer yet. If you have
> an idea let me know.

Okay, it crashes around .Lread_apicid due to touching MSRs that trigger #VE.

Looks like the patch had no intention to enable parallel bringup on TDX.

+	 * Intel-TDX has a secure RDMSR hypercall, but that needs to be
+	 * implemented seperately in the low level startup ASM code.

But CC_ATTR_GUEST_STATE_ENCRYPT, which used to filter it out, is a
SEV-ES-specific thingy and doesn't cover TDX. I don't think we have an
attribute that fits nicely here.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov