On Tue, Mar 28 2023 at 20:57, Usama Arif wrote:
> The APs will then take turns through the real mode code (which has its
> own bitlock for exclusion) until they make it to their own stack, then
> proceed through the first few lines of start_secondary() and execute
> these parts in parallel:
>
>  start_secondary()
>     -> cr4_init()
>     -> (some 32-bit only stuff so not in the parallel cases)
>     -> cpu_init_secondary()
>        -> cpu_init_exception_handling()
>        -> cpu_init()
>           -> wait_for_master_cpu()
>
> At this point they wait for the BSP to set their bit in cpu_callout_mask
> (from do_wait_cpu_initialized()), and release them to continue through
> the rest of cpu_init() and beyond.

That's actually broken on SMT-enabled machines when microcode needs to
be updated.

Let's look at a 2 core, 4 thread system, where CPU0/2 and CPU1/3 are the
sibling pairs.

CPU 0:                          CPU1           CPU2          CPU3
for_each_present_cpu(cpu)
   cpu_up(cpu, KICK_AP_ALIVE);
                                startup()
                                wait()
                                               startup()
                                               wait()
   Release CPU1                 load_ucode()                 startup()
                                                             wait()

So that violates the rule of microcode loading that the sibling must be
in a state where it does not execute anything which might be affected by
the microcode update. The fragile startup code does not really qualify
as such a state :)

Thanks,

        tglx
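
The interleaving in the diagram can be replayed with a small user-space
model. This is only a sketch and not kernel code: the state names, the
event list and first_unsafe_event() are made up for illustration, and it
assumes that a sibling counts as unsafe whenever it is anywhere in the
startup path, including the wait loop.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Model-only states; these are not kernel definitions. */
enum cpu_state {
	CPU_OFFLINE,		/* has not started executing yet */
	CPU_IN_STARTUP,		/* running early startup code */
	CPU_WAITING,		/* spinning until the BSP releases it */
	CPU_LOADING_UCODE,	/* applying a microcode update */
	CPU_ONLINE,
};

/* One step of a timeline: a CPU enters a new state. */
struct event {
	int cpu;
	enum cpu_state state;
};

/* Sibling map from the example: CPU0/2 and CPU1/3 share a core. */
static const int sibling_of[NR_CPUS] = { 2, 3, 0, 1 };

/* Assumption: startup code, including the wait loop, is not a safe state. */
static bool sibling_is_safe(enum cpu_state s)
{
	return s != CPU_IN_STARTUP && s != CPU_WAITING;
}

/* True if any CPU is loading microcode while its sibling is unsafe. */
static bool any_unsafe_load(const enum cpu_state *state)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (state[cpu] == CPU_LOADING_UCODE &&
		    !sibling_is_safe(state[sibling_of[cpu]]))
			return true;
	}
	return false;
}

/*
 * Replay the events in order. Return the index of the first event after
 * which the microcode rule is violated, or -1 if it never is.
 */
static int first_unsafe_event(const struct event *ev, size_t n)
{
	enum cpu_state state[NR_CPUS];

	state[0] = CPU_ONLINE;		/* CPU0 is the BSP */
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		state[cpu] = CPU_OFFLINE;

	for (size_t i = 0; i < n; i++) {
		state[ev[i].cpu] = ev[i].state;
		if (any_unsafe_load(state))
			return (int)i;
	}
	return -1;
}

/* Parallel bringup as in the diagram: CPU1 is released while CPU3 starts. */
static const struct event parallel_timeline[] = {
	{ 1, CPU_IN_STARTUP }, { 1, CPU_WAITING },
	{ 2, CPU_IN_STARTUP }, { 2, CPU_WAITING },
	{ 1, CPU_LOADING_UCODE },
	{ 3, CPU_IN_STARTUP },	/* sibling of CPU1 runs startup code */
};

/* Serial bringup: each AP is fully online before the next one starts. */
static const struct event serial_timeline[] = {
	{ 1, CPU_IN_STARTUP }, { 1, CPU_LOADING_UCODE }, { 1, CPU_ONLINE },
	{ 2, CPU_IN_STARTUP }, { 2, CPU_LOADING_UCODE }, { 2, CPU_ONLINE },
	{ 3, CPU_IN_STARTUP }, { 3, CPU_LOADING_UCODE }, { 3, CPU_ONLINE },
};
```

Called from a main() of your choice, first_unsafe_event() returns 5 for
parallel_timeline (CPU3 enters startup while CPU1 is still loading) and
-1 for serial_timeline, where a sibling is always either offline or
fully online during the load.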