This is a note to let you know that I've just added the patch titled EDAC, sb_edac: Fix channel reporting on Knights Landing to the 4.7-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: edac-sb_edac-fix-channel-reporting-on-knights-landing.patch and it can be found in the queue-4.7 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From c5b48fa7e298b9a8968a1c1fc0ef013069ca2dd2 Mon Sep 17 00:00:00 2001 From: Lukasz Odzioba <lukasz.odzioba@xxxxxxxxx> Date: Sat, 23 Jul 2016 01:44:49 +0200 Subject: EDAC, sb_edac: Fix channel reporting on Knights Landing From: Lukasz Odzioba <lukasz.odzioba@xxxxxxxxx> commit c5b48fa7e298b9a8968a1c1fc0ef013069ca2dd2 upstream. On Intel Xeon Phi Knights Landing processor family the channels of the memory controller have untypical arrangement - MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the channel name incorrectly. We missed this change earlier, so the code already contains similar comment, but the translation function is incorrect. Without this patch: errors in DIMM_A and DIMM_D were reported in DIMM_D errors in DIMM_B and DIMM_E were reported in DIMM_E errors in DIMM_C and DIMM_F were reported in DIMM_F Correct this. Hubert Chrzaniuk: - rebased to 4.8 - comments and code cleanup Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support") Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx> Cc: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx> Cc: Hubert Chrzaniuk <hubert.chrzaniuk@xxxxxxxxx> Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx> Cc: lukasz.anaczkowski@xxxxxxxxx Cc: lukasz.odzioba@xxxxxxxxx Cc: mchehab@xxxxxxxxxx Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@xxxxxxxxx Signed-off-by: Lukasz Odzioba <lukasz.odzioba@xxxxxxxxx> [ Boris: Simplify a bit by removing char mc. ] Signed-off-by: Borislav Petkov <bp@xxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/edac/sb_edac.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) --- a/drivers/edac/sb_edac.c +++ b/drivers/edac/sb_edac.c @@ -552,9 +552,9 @@ static const struct pci_id_table pci_dev /* Knight's Landing Support */ /* * KNL's memory channels are swizzled between memory controllers. - * MC0 is mapped to CH3,5,6 and MC1 is mapped to CH0,1,2 + * MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2 */ -#define knl_channel_remap(channel) ((channel + 3) % 6) +#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3) /* Memory controller, TAD tables, error injection - 2-8-0, 2-9-0 (2 of these) */ #define PCI_DEVICE_ID_INTEL_KNL_IMC_MC 0x7840 @@ -1286,7 +1286,7 @@ static u32 knl_get_mc_route(int entry, u mc = GET_BITFIELD(reg, entry*3, (entry*3)+2); chan = GET_BITFIELD(reg, (entry*2) + 18, (entry*2) + 18 + 1); - return knl_channel_remap(mc*3 + chan); + return knl_channel_remap(mc, chan); } /* @@ -2997,8 +2997,15 @@ static void sbridge_mce_output_error(str } else { char A = *("A"); - channel = knl_channel_remap(channel); + /* + * Reported channel is in range 0-2, so we can't map it + * back to mc. To figure out mc we check machine check + * bank register that reported this error. + * bank15 means mc0 and bank16 means mc1. + */ + channel = knl_channel_remap(m->bank == 16, channel); channel_mask = 1 << channel; + snprintf(msg, sizeof(msg), "%s%s err_code:%04x:%04x channel:%d (DIMM_%c)", overflow ? " OVERFLOW" : "", Patches currently in stable-queue which might be from lukasz.odzioba@xxxxxxxxx are queue-4.7/edac-sb_edac-fix-channel-reporting-on-knights-landing.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html