On 01/18/2017 12:20 PM, Greg Kroah-Hartman wrote: > On Wed, Jan 18, 2017 at 11:55:58AM -0500, Prarit Bhargava wrote: >> >> >> On 01/18/2017 11:33 AM, Greg Kroah-Hartman wrote: >>> On Wed, Jan 18, 2017 at 09:38:07AM -0500, Prarit Bhargava wrote: >>>> >>>> >>>> On 01/18/2017 05:45 AM, Greg Kroah-Hartman wrote: >>>>> 4.9-stable review patch. If anyone has any objections, please let me know. >>>>> >>>> >>>> Nack. >>>> >>>> The value of boot_cpu_data.logical_proc_id may be uninitialized and set to >>>> default -1 on systems that pick a random core as boot cpu. This was >>>> inadvertently fixed by 9d85eb9119f4 ("x86/smpboot: Make logical package >>>> management more robust") which is in 4.10-rc1. >>>> >>>> Before 9d85eb9119f4: >>>> >>>> [ 3.971539] hswep_uncore_cpu_init: cpu 5 pkg 0 boot_cpu_data.logical_proc_id >>>> 65535 >>>> [ 3.976504] hswep_uncore_cpu_init: cpu 5 pkg 0 cpu_data(0).logical_proc_id 0 >>>> >>>> After 9d85eb9119f4: >>>> >>>> [ 3.919112] hswep_uncore_cpu_init: cpu 5 pkg 0 boot_cpu_data.logical_proc_id 0 >>>> [ 3.923391] hswep_uncore_cpu_init: cpu 5 pkg 0 cpu_data(0).logical_proc_id 0 >>>> >>>> This patch should not be applied to any stable branch. >>> >>> So the fixes: line lies? This isn't needed at all for 4.9? >> >> No, the fixes: line does not lie. This patch is not needed at all for 4.9. >> Other patches are required beyond this patch in order for 4.9 to remain stable. >> >> This patch is 6d6daa20945f ("perf/x86/intel/uncore: Fix hardcoded socket 0 >> assumption in the Haswell init code") which was tested on and applied to >> 4.10-rc3 IIRC. This patch was applied to 4.10-rc4. >> >> [prarit@prarit linux]$ git describe --contains 6d6daa20945f >> v4.10-rc4~9^2~5 > > Yes, but the patch says it is fixing a bug since 4.6-rc1. The fixes > line says: > Fixes: cf6d445f6897 ("perf/x86/uncore: Track packages, not per CPU data") > > $ git dc cf6d445f6897 > v4.6-rc1~165^2~28 > > (dc is my git alias for "describe --contains" as I type it so often...) > >> 4.9 is broken and requires additional patches beyond this patch. Applying this >> patch to 4.9 stable without those additional fixes will result in kernel panics >> on some Haswell systems that boot on random cores. > > So, does 4.9 on its own work properly on these systems? If not, what > are the commits that are needed to fix it? The system is being used by someone else atm. I know of at least two commits that are required but I have not tested them on 4.9 or 4.9 stable. I do know that 4.10-rc2-ish was broken on these systems when booting on any other cpu than cpu 0. > > If 4.9 is fine as-is, great, we should drop this patch then, correct? > But then that fixes: line lied :( Let me see if I can explain it this way: (The commits are displayed as commit hash, version that contains commit, and my own description) cf6d445f6897 v4.6-rc1~165^2~28 <- what my commit "Fixes:" d49597fd3bc v4.9-rc6~33^2 <- required for 9d85eb9119f, causes boot_cpu_data.logical_proc_id is -1. 9d85eb9119f v4.10-rc1~51^2~8 <- This patch unintentionally fixes code so that boot_cpu_data.logical_proc_id is greater than or equal to 0. 6d6daa20945f v4.10-rc4~9^2~5 <- my commit, expects boot_cpu_data.logical_proc_id greater than or equal to 0. cf6d445f6897 v4.6-rc1~165^2~28 caused the bug to happen. When I found the bug the upstream tree was at 4.10-rc2-ish. My fix for the bug is 6d6daa20945f v4.10-rc4~9^2~5 which has "Fixes: cf6d445f6897" because that commit causes this to happen. If my commit 6d6daa20945f v4.10-rc4~9^2~5 is applied to 4.9 stable without 9d85eb9119f v4.10-rc1~51^2~8 things will be *worse* because boot_cpu_data.logical_proc_id is -1, and this will result in a NULL pointer panic during boot on *all* Haswell systems. If we take d49597fd3bc, 9d85eb9119f, and 6d6daa20945f things should work but as I said above, I have not built or tested that. I can test this combination of patches if you think it is worthwhile for 4.9 stable. I think the fixes: line is correct. The git bisect points at that commit. The problem with backporting my patch to 4.9 is that my fix requires other patches that are not in 4.9. P. -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html