Extend error messages to mention CAP_SYS_PERFMON capability as an option to substitute CAP_SYS_ADMIN capability for secure system performance monitoring and observability operations [1]. Make perf_event_paranoid_check() to be aware of CAP_SYS_PERFMON capability. [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html Signed-off-by: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx> --- tools/perf/design.txt | 3 ++- tools/perf/util/cap.h | 4 ++++ tools/perf/util/evsel.c | 10 +++++----- tools/perf/util/util.c | 1 + 4 files changed, 12 insertions(+), 6 deletions(-) diff --git a/tools/perf/design.txt b/tools/perf/design.txt index 0453ba26cdbd..71755b3e1303 100644 --- a/tools/perf/design.txt +++ b/tools/perf/design.txt @@ -258,7 +258,8 @@ gets schedule to. Per task counters can be created by any user, for their own tasks. A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts -all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege. +all events on CPU-x. Per CPU counters need CAP_SYS_PERFMON or +CAP_SYS_ADMIN privilege. The 'flags' parameter is currently unused and must be zero. diff --git a/tools/perf/util/cap.h b/tools/perf/util/cap.h index 051dc590ceee..0f79fbf6638b 100644 --- a/tools/perf/util/cap.h +++ b/tools/perf/util/cap.h @@ -29,4 +29,8 @@ static inline bool perf_cap__capable(int cap __maybe_unused) #define CAP_SYSLOG 34 #endif +#ifndef CAP_SYS_PERFMON +#define CAP_SYS_PERFMON 38 +#endif + #endif /* __PERF_CAP_H */ diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index f4dea055b080..3a46325e3702 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -2468,14 +2468,14 @@ int perf_evsel__open_strerror(struct evsel *evsel, struct target *target, "You may not have permission to collect %sstats.\n\n" "Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n" "which controls use of the performance events system by\n" - "unprivileged users (without CAP_SYS_ADMIN).\n\n" + "unprivileged users (without CAP_SYS_PERFMON or CAP_SYS_ADMIN).\n\n" "The current value is %d:\n\n" " -1: Allow use of (almost) all events by all users\n" " Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n" - ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n" - " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n" - ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n" - ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n" + ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n" + " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n" + ">= 1: Disallow CPU event access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n" + ">= 2: Disallow kernel profiling by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n\n" "To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n" " kernel.perf_event_paranoid = -1\n" , target->system_wide ? "system-wide " : "", diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c index 969ae560dad9..9981db0d8d09 100644 --- a/tools/perf/util/util.c +++ b/tools/perf/util/util.c @@ -272,6 +272,7 @@ int perf_event_paranoid(void) bool perf_event_paranoid_check(int max_level) { return perf_cap__capable(CAP_SYS_ADMIN) || + perf_cap__capable(CAP_SYS_PERFMON) || perf_event_paranoid() <= max_level; } -- 2.20.1