After enabling CONFIG_IRQ_TIME_ACCOUNTING to track IRQ pressure in our
container environment, we encountered several user-visible behavioral
changes:

- Interrupted IRQ/softirq time is not accounted for in the cpuacct cgroup

  This breaks userspace applications that rely on CPU usage data from
  cgroups to monitor CPU pressure. This patchset resolves the issue by
  ensuring that IRQ/softirq time is accounted for in the cgroup of the
  interrupted tasks.

- getrusage(2) does not include time interrupted by IRQ/softirq

  Some services use getrusage(2) to check if workloads are experiencing CPU
  pressure. Since IRQ/softirq time is no longer charged to task runtime,
  getrusage(2) can no longer reflect the CPU pressure caused by heavy
  interrupts.

This patchset addresses the first issue, which is relatively
straightforward. However, the second issue remains unresolved, as there
might be debate over whether interrupted time should be considered part of
a task's usage. Nonetheless, it is important to report interrupted time to
the user via some metric, though that is a separate discussion.

Changes:
v3->v4:
- Rebase

v2->v3:
- Add a helper account_irqtime() to avoid redundant code (Johannes)

v1->v2: https://lore.kernel.org/cgroups/20241008061951.3980-1-laoar.shao@xxxxxxxxx/
- Fix lockdep issues reported by kernel test robot <oliver.sang@xxxxxxxxx>

v1: https://lore.kernel.org/all/20240923090028.16368-1-laoar.shao@xxxxxxxxx/

Yafang Shao (4):
  sched: Define sched_clock_irqtime as static key
  sched: Don't account irq time if sched_clock_irqtime is disabled
  sched, psi: Don't account irq time if sched_clock_irqtime is disabled
  sched: Fix cgroup irq accounting for CONFIG_IRQ_TIME_ACCOUNTING

 kernel/sched/core.c    | 77 +++++++++++++++++++++++++++++------------
 kernel/sched/cputime.c | 16 ++++-----
 kernel/sched/psi.c     | 11 ++----
 kernel/sched/sched.h   |  1 +
 kernel/sched/stats.h   |  7 ++--
 5 files changed, 68 insertions(+), 44 deletions(-)

-- 
2.43.5