Patch "EDAC/{skx_common,i10nm}: Fix incorrect far-memory error source indicator" has been added to the 6.11-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    EDAC/{skx_common,i10nm}: Fix incorrect far-memory error source indicator

to the 6.11-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     edac-skx_common-i10nm-fix-incorrect-far-memory-error.patch
and it can be found in the queue-6.11 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 73d22853d915715ba9385ded82c7d69a5a3f7521
Author: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
Date:   Tue Oct 15 15:22:36 2024 +0800

    EDAC/{skx_common,i10nm}: Fix incorrect far-memory error source indicator
    
    [ Upstream commit a36667037a0c0e36c59407f8ae636295390239a5 ]
    
    The Granite Rapids CPUs with Flat2LM memory configurations may
    mistakenly report near-memory errors as far-memory errors, resulting
    in the invalid decoded ADXL results:
    
      EDAC skx: Bad imc -1
    
    Fix this incorrect far-memory error source indicator by prefetching the
    decoded far-memory controller ID, and adjust the error source indicator
    to near-memory if the far-memory controller ID is invalid.
    
    Fixes: ba987eaaabf9 ("EDAC/i10nm: Add Intel Granite Rapids server support")
    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
    Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
    Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@xxxxxxxxx>
    Link: https://lore.kernel.org/r/20241015072236.24543-3-qiuxu.zhuo@xxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index 24dd896d9a9d5..4e98fe16f0913 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -1089,6 +1089,7 @@ static int __init i10nm_init(void)
 		return -ENODEV;
 
 	cfg = (struct res_config *)id->driver_data;
+	skx_set_res_cfg(cfg);
 	res_cfg = cfg;
 
 	rc = skx_get_hi_lo(0x09a2, off, &tolm, &tohm);
diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 42266120ef427..0b8aaf5f77d9f 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -47,6 +47,7 @@ static skx_show_retry_log_f skx_show_retry_rd_err_log;
 static u64 skx_tolm, skx_tohm;
 static LIST_HEAD(dev_edac_list);
 static bool skx_mem_cfg_2lm;
+static struct res_config *skx_res_cfg;
 
 int skx_adxl_get(void)
 {
@@ -135,6 +136,22 @@ static bool skx_adxl_decode(struct decoded_addr *res, enum error_source err_src)
 		return false;
 	}
 
+	/*
+	 * GNR with a Flat2LM memory configuration may mistakenly classify
+	 * a near-memory error(DDR5) as a far-memory error(CXL), resulting
+	 * in the incorrect selection of decoded ADXL components.
+	 * To address this, prefetch the decoded far-memory controller ID
+	 * and adjust the error source to near-memory if the far-memory
+	 * controller ID is invalid.
+	 */
+	if (skx_res_cfg && skx_res_cfg->type == GNR && err_src == ERR_SRC_2LM_FM) {
+		res->imc = (int)adxl_values[component_indices[INDEX_MEMCTRL]];
+		if (res->imc == -1) {
+			err_src = ERR_SRC_2LM_NM;
+			edac_dbg(0, "Adjust the error source to near-memory.\n");
+		}
+	}
+
 	res->socket  = (int)adxl_values[component_indices[INDEX_SOCKET]];
 	if (err_src == ERR_SRC_2LM_NM) {
 		res->imc     = (adxl_nm_bitmap & BIT_NM_MEMCTRL) ?
@@ -191,6 +208,12 @@ void skx_set_mem_cfg(bool mem_cfg_2lm)
 }
 EXPORT_SYMBOL_GPL(skx_set_mem_cfg);
 
+void skx_set_res_cfg(struct res_config *cfg)
+{
+	skx_res_cfg = cfg;
+}
+EXPORT_SYMBOL_GPL(skx_set_res_cfg);
+
 void skx_set_decode(skx_decode_f decode, skx_show_retry_log_f show_retry_log)
 {
 	driver_decode = decode;
diff --git a/drivers/edac/skx_common.h b/drivers/edac/skx_common.h
index 58b8ac62becf5..5a7111791c109 100644
--- a/drivers/edac/skx_common.h
+++ b/drivers/edac/skx_common.h
@@ -241,6 +241,7 @@ int skx_adxl_get(void);
 void skx_adxl_put(void);
 void skx_set_decode(skx_decode_f decode, skx_show_retry_log_f show_retry_log);
 void skx_set_mem_cfg(bool mem_cfg_2lm);
+void skx_set_res_cfg(struct res_config *cfg);
 
 int skx_get_src_id(struct skx_dev *d, int off, u8 *id);
 int skx_get_node_id(struct skx_dev *d, u8 *id);




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux