On Thu, 28 Mar 2024 at 12:36, Linux regression tracking (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx> wrote: > > [CCing Linus, in case I say something to his disliking] > > On 22.03.24 05:57, Nick Bowler wrote: > > > > Just a friendly reminder that this issue still happens on Linux 6.8 and > > reverting commit 9b2f753ec237 as indicated below is still sufficient to > > resolve the problem. > > FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if > nr_cpus is set") is from v4.8. Reverting it after all that time might > easily lead to even bigger trouble. I'm definitely not reverting a patch from almost a decade ago as a regression. If it took that long to find, it can't be that critical of a regression. So yes, let's treat it as a regular bug. And let's bring in Andreas to the discussion too (although presumably he has seen it on the sparclinux mailing list). Andreas, if not, here's the link to lore for the beginning of the thread: https://lore.kernel.org/all/CADyTPEwt=ZNams+1bpMB1F9w_vUdPsGCt92DBQxxq_VtaLoTdw@xxxxxxxxxxxxxx/ And from a quick look I do think that commit is buggy, and yes, the fix probably is just be to revert it. As the original report makes clear, that commit 9b2f753ec23710 is clearly confused about the difference between "number of CPU's", and "index of CPU numbers". When that smp_fill_in_cpu_possible_map() does int possible_cpus = num_possible_cpus(); and then uses that to fill in &__cpu_possible_mask, that's completely nonsensical. Because we literally have #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask) #define num_possible_cpus() cpumask_weight(cpu_possible_mask) so it's reading cpu_possible_mask to figure out how many cpus it might have, and then using that number to set possibly *different* bits in the same bitmap that is just used to judge what the max number is. So I do think a revert is called for, but I'm not going to treat this as a regression, I'm going to just treat it as "sparc bug" and hope that the sparc people try to figure out why that crazy code was written. And maybe it made more sense back a decade ago than it does now. Andreas? Linus