On Sun, Mar 13, 2022 at 09:22:21AM +0800, guoren@xxxxxxxxxx wrote: > From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx> > > These patch_text implementations are using stop_machine_cpuslocked > infrastructure with atomic cpu_count. The original idea: When the > master CPU patch_text, the others should wait for it. I couldn't find the original intent in the commit logs (at least not in the arm64 logs). Maybe the intention was for the CPUs to wait for the text patching to complete rather than the master CPU to wait for the others to enter the cpu_relax() loop before patching. I think your patch makes sense anyway, the master CPU would wait for all the others to enter the cpu_relax() loop before patching and releasing them with another increment. You probably wouldn't see any issues in practice unless you insert probes in the multi_stop_cpu() function (or we could mark this function as __kprobes and get rid of the extra loops entirely). > --- a/arch/arm64/kernel/patching.c > +++ b/arch/arm64/kernel/patching.c > @@ -117,8 +117,8 @@ static int __kprobes aarch64_insn_patch_text_cb(void *arg) > int i, ret = 0; > struct aarch64_insn_patch *pp = arg; > > - /* The first CPU becomes master */ > - if (atomic_inc_return(&pp->cpu_count) == 1) { > + /* The last CPU becomes master */ > + if (atomic_inc_return(&pp->cpu_count) == num_online_cpus()) { > for (i = 0; ret == 0 && i < pp->insn_cnt; i++) > ret = aarch64_insn_patch_text_nosync(pp->text_addrs[i], > pp->new_insns[i]); For arm64: Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx> -- Catalin