Patch "sched/psi: report zeroes for CPU full at the system level" has been added to the 5.15-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    sched/psi: report zeroes for CPU full at the system level

to the 5.15-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sched-psi-report-zeroes-for-cpu-full-at-the-system-l.patch
and it can be found in the queue-5.15 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit f55e8294c5e7ea6f5aca9ea78afe2976a02df9ba
Author: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Date:   Fri Apr 8 20:19:14 2022 +0800

    sched/psi: report zeroes for CPU full at the system level
    
    [ Upstream commit 890d550d7dbac7a31ecaa78732aa22be282bb6b8 ]
    
    Martin find it confusing when look at the /proc/pressure/cpu output,
    and found no hint about that CPU "full" line in psi Documentation.
    
    % cat /proc/pressure/cpu
    some avg10=0.92 avg60=0.91 avg300=0.73 total=933490489
    full avg10=0.22 avg60=0.23 avg300=0.16 total=358783277
    
    The PSI_CPU_FULL state is introduced by commit e7fcd7622823
    ("psi: Add PSI_CPU_FULL state"), which mainly for cgroup level,
    but also counted at the system level as a side effect.
    
    Naturally, the FULL state doesn't exist for the CPU resource at
    the system level. These "full" numbers can come from CPU idle
    schedule latency. For example, t1 is the time when task wakeup
    on an idle CPU, t2 is the time when CPU pick and switch to it.
    The delta of (t2 - t1) will be in CPU_FULL state.
    
    Another case all processes can be stalled is when all cgroups
    have been throttled at the same time, which unlikely to happen.
    
    Anyway, CPU_FULL metric is meaningless and confusing at the
    system level. So this patch will report zeroes for CPU full
    at the system level, and update psi Documentation accordingly.
    
    Fixes: e7fcd7622823 ("psi: Add PSI_CPU_FULL state")
    Reported-by: Martin Steigerwald <Martin.Steigerwald@xxxxxxxxx>
    Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx>
    Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220408121914.82855-1-zhouchengming@xxxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/Documentation/accounting/psi.rst b/Documentation/accounting/psi.rst
index 860fe651d645..5e40b3f437f9 100644
--- a/Documentation/accounting/psi.rst
+++ b/Documentation/accounting/psi.rst
@@ -37,11 +37,7 @@ Pressure interface
 Pressure information for each resource is exported through the
 respective file in /proc/pressure/ -- cpu, memory, and io.
 
-The format for CPU is as such::
-
-	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
-
-and for memory and IO::
+The format is as such::
 
 	some avg10=0.00 avg60=0.00 avg300=0.00 total=0
 	full avg10=0.00 avg60=0.00 avg300=0.00 total=0
@@ -58,6 +54,9 @@ situation from a state where some tasks are stalled but the CPU is
 still doing productive work. As such, time spent in this subset of the
 stall state is tracked separately and exported in the "full" averages.
 
+CPU full is undefined at the system level, but has been reported
+since 5.13, so it is set to zero for backward compatibility.
+
 The ratios (in %) are tracked as recent trends over ten, sixty, and
 three hundred second windows, which gives insight into short term events
 as well as medium and long term trends. The total absolute stall time
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 422f3b0445cf..cad2a1b34ed0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1062,14 +1062,17 @@ int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
 	mutex_unlock(&group->avgs_lock);
 
 	for (full = 0; full < 2; full++) {
-		unsigned long avg[3];
-		u64 total;
+		unsigned long avg[3] = { 0, };
+		u64 total = 0;
 		int w;
 
-		for (w = 0; w < 3; w++)
-			avg[w] = group->avg[res * 2 + full][w];
-		total = div_u64(group->total[PSI_AVGS][res * 2 + full],
-				NSEC_PER_USEC);
+		/* CPU FULL is undefined at the system level */
+		if (!(group == &psi_system && res == PSI_CPU && full)) {
+			for (w = 0; w < 3; w++)
+				avg[w] = group->avg[res * 2 + full][w];
+			total = div_u64(group->total[PSI_AVGS][res * 2 + full],
+					NSEC_PER_USEC);
+		}
 
 		seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
 			   full ? "full" : "some",



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux