Patch "powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs" has been added to the 6.10-stable tree

This is a note to let you know that I've just added the patch titled

    powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs

to the 6.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powercap-intel_rapl-fix-the-energy-pkg-event-for-amd.patch
and it can be found in the queue-6.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit fa1a0bfaff0e93129d8151b8d8515bb6d7607a73
Author: Dhananjay Ugwekar <Dhananjay.Ugwekar@xxxxxxx>
Date:   Tue Jul 30 04:49:19 2024 +0000

    powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs
    
    [ Upstream commit 26096aed255fbac9501718174dbb24c935d8854e ]
    
    After commit 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD
    0x80000026 leaf"), on AMD processors that support extended CPUID leaf
    0x80000026, the topology_logical_die_id() macro no longer returns the
    package id; instead, it returns the CCD (Core Complex Die) id. This
    changes the scope of the energy-pkg event from package to CCD.
    
    For more historical context, please refer to commit 32fb480e0a2c
    ("powercap/intel_rapl: Support multi-die/package"), which initially changed
    the RAPL scope from package to die for all systems, as Intel systems
    with die enumeration have RAPL scope as die, and those without die
    enumeration are not affected. So all systems (Intel, AMD, Hygon) worked
    correctly with topology_logical_die_id() until recently, but this changed
    after the "0x80000026 leaf" commit mentioned above.
    
    Future multi-die Intel systems will have package-scope RAPL counters,
    but they will use the TPMI RAPL interface, which is not affected by
    this change.
    
    Replacing topology_logical_die_id() with topology_physical_package_id()
    only on AMD and Hygon fixes the energy-pkg event.
    
    On an AMD 2 socket 8 CCD Zen4 server:
    
    Before:
    
    linux$ ls /sys/class/powercap/
    intel-rapl      intel-rapl:4    intel-rapl:8:0  intel-rapl:d
    intel-rapl:0    intel-rapl:4:0  intel-rapl:9    intel-rapl:d:0
    intel-rapl:0:0  intel-rapl:5    intel-rapl:9:0  intel-rapl:e
    intel-rapl:1    intel-rapl:5:0  intel-rapl:a    intel-rapl:e:0
    intel-rapl:1:0  intel-rapl:6    intel-rapl:a:0  intel-rapl:f
    intel-rapl:2    intel-rapl:6:0  intel-rapl:b    intel-rapl:f:0
    intel-rapl:2:0  intel-rapl:7    intel-rapl:b:0
    intel-rapl:3    intel-rapl:7:0  intel-rapl:c
    intel-rapl:3:0  intel-rapl:8    intel-rapl:c:0
    
    After:
    
    linux$ ls /sys/class/powercap/
    intel-rapl  intel-rapl:0  intel-rapl:0:0  intel-rapl:1  intel-rapl:1:0
    
    Only one sysfs entry per-event per-package is created after this change.
    
    Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
    Reported-by: Michael Larabel <michael@xxxxxxxxxxxxxxxxxx>
    Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@xxxxxxx>
    Reviewed-by: Zhang Rui <rui.zhang@xxxxxxxxx>
    Link: https://patch.msgid.link/20240730044917.4680-3-Dhananjay.Ugwekar@xxxxxxx
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index d51d4ec8d707c..28bc6f85b6c87 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -2129,6 +2129,21 @@ void rapl_remove_package(struct rapl_package *rp)
 }
 EXPORT_SYMBOL_GPL(rapl_remove_package);
 
+/*
+ * RAPL Package energy counter scope:
+ * 1. AMD/HYGON platforms use per-PKG package energy counter
+ * 2. For Intel platforms
+ *	2.1 CLX-AP platform has per-DIE package energy counter
+ *	2.2 Other platforms that uses MSR RAPL are single die systems so the
+ *          package energy counter can be considered as per-PKG/per-DIE,
+ *          here it is considered as per-DIE.
+ *	2.3 New platforms that use TPMI RAPL doesn't care about the
+ *	    scope because they are not MSR/CPU based.
+ */
+#define rapl_msrs_are_pkg_scope()				\
+	(boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||	\
+	 boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+
 /* caller to ensure CPU hotplug lock is held */
 struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
 							 bool id_is_cpu)
@@ -2136,8 +2151,14 @@ struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_
 	struct rapl_package *rp;
 	int uid;
 
-	if (id_is_cpu)
-		uid = topology_logical_die_id(id);
+	if (id_is_cpu) {
+		uid = rapl_msrs_are_pkg_scope() ?
+		      topology_physical_package_id(id) : topology_logical_die_id(id);
+		if (uid < 0) {
+			pr_err("topology_logical_(package/die)_id() returned a negative value");
+			return ERR_PTR(-EINVAL);
+		}
+	}
 	else
 		uid = id;
 
@@ -2169,9 +2190,14 @@ struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *pr
 		return ERR_PTR(-ENOMEM);
 
 	if (id_is_cpu) {
-		rp->id = topology_logical_die_id(id);
+		rp->id = rapl_msrs_are_pkg_scope() ?
+			 topology_physical_package_id(id) : topology_logical_die_id(id);
+		if ((int)(rp->id) < 0) {
+			pr_err("topology_logical_(package/die)_id() returned a negative value");
+			return ERR_PTR(-EINVAL);
+		}
 		rp->lead_cpu = id;
-		if (topology_max_dies_per_package() > 1)
+		if (!rapl_msrs_are_pkg_scope() && topology_max_dies_per_package() > 1)
 			snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
 				 topology_physical_package_id(id), topology_die_id(id));
 		else
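
The effect of the scope selection in the patch above can be illustrated with a
small userspace sketch. This is not kernel code: struct cpu_topo, enum vendor,
fill_topology() and count_rapl_domains() are hypothetical names invented for
the illustration, and the toy topology assumes one CPU per CCD with
system-wide die ids. It only models the idea that AMD/Hygon pick the package
id while other vendors keep the die id, and that a negative id is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU topology record; not a kernel structure. */
struct cpu_topo {
	int package_id;	/* stands in for topology_physical_package_id() */
	int die_id;	/* stands in for topology_logical_die_id() */
};

/* Hypothetical vendor tag, standing in for boot_cpu_data.x86_vendor. */
enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

#define MAX_DOMAINS 64

/* Same idea as rapl_msrs_are_pkg_scope(): AMD/Hygon counters are per package. */
bool msrs_are_pkg_scope(enum vendor v)
{
	return v == VENDOR_AMD || v == VENDOR_HYGON;
}

/* Pick the domain id for one CPU; -1 signals an invalid topology id. */
int rapl_domain_uid(enum vendor v, const struct cpu_topo *t)
{
	int uid = msrs_are_pkg_scope(v) ? t->package_id : t->die_id;

	return uid < 0 ? -1 : uid;
}

/* Count distinct domain ids over all CPUs; returns -1 on an invalid id. */
int count_rapl_domains(enum vendor v, const struct cpu_topo *cpus, size_t n)
{
	bool seen[MAX_DOMAINS] = { false };
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		int uid = rapl_domain_uid(v, &cpus[i]);

		if (uid < 0)
			return -1;
		assert(uid < MAX_DOMAINS);
		if (!seen[uid]) {
			seen[uid] = true;
			count++;
		}
	}
	return count;
}

/* Build a toy topology: one CPU per CCD, ccds_per_pkg CCDs in each package. */
size_t fill_topology(struct cpu_topo *cpus, int packages, int ccds_per_pkg)
{
	size_t n = 0;

	for (int p = 0; p < packages; p++) {
		for (int c = 0; c < ccds_per_pkg; c++) {
			cpus[n].package_id = p;
			cpus[n].die_id = p * ccds_per_pkg + c; /* system-wide id */
			n++;
		}
	}
	return n;
}
```

With a toy topology of 2 packages and 8 CCDs per package, counting domains
with VENDOR_AMD gives 2, while the die-scope path (here reached by passing
VENDOR_INTEL for the same topology) gives 16. That corresponds to the 2 versus
16 top-level intel-rapl:N entries in the "After" and "Before" listings in the
commit message.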



