Re: [PATCH v15 03/12] cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2023-03-20 at 15:30 +0100, Thomas Gleixner wrote:
> 
> This causes a subtle issue. The bringup loop above moves all CPUs to
> cpuhp_state == CPUHP_BP_PARALLEL_DYN_END. So the serial bootup will
> start from there and bring them fully up.
> 
> Now if a bringup fails, then the rollback will only go back down to
> CPUHP_BP_PARALLEL_DYN_END, which means that the control CPU won't do any
> cleanups below CPUHP_BP_PARALLEL_DYN_END.
> 
> That 'fail' is a common case for SMT soft disable via the 'nosmt'
> command line parameter. Due to the marvelous MCE broadcast 'feature' we
> need to bringup the SMT siblings at least to the CPUHP_AP_ONLINE_IDLE
> state once and then roll them back.
> 
> While this is not necessarily a fatal problem, it's changing behaviour
> and with quite some of the details hidden in the (then not issued)
> teardown callbacks might cause some hard to decode subtle surprises.
> 
> So that second for_each_present_cpu() loop needs to check the return
> value of cpu_up() and issue a full rollback to CPUHP_OFFLINE in case of
> fail.

@@ -1524,8 +1531,22 @@ void bringup_nonboot_cpus(unsigned int setup_max_cpus)
        for_each_present_cpu(cpu) {
                if (num_online_cpus() >= setup_max_cpus)
                        break;
-               if (!cpu_online(cpu))
-                       cpu_up(cpu, CPUHP_ONLINE);
+               if (!cpu_online(cpu)) {
+                       int ret = cpu_up(cpu, CPUHP_ONLINE);
+
+                       /*
+                        * For the parallel bringup case, roll all the way back
+                        * to CPUHP_OFFLINE on failure; don't leave them in the
+                        * parallel stages. This happens in the nosmt case for
+                        * non-primary threads.
+                        */
+                       if (ret && cpuhp_hp_states[CPUHP_BP_PARALLEL_DYN].name) {
+                               struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
+                               if (can_rollback_cpu(st))
+                                       WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
+                                                                           CPUHP_OFFLINE));
+                       }
+               }
        }
 }
 

Attachment: smime.p7s
Description: S/MIME cryptographic signature


[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux