On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> Currently access to perf_events functionality [1] beyond the scope permitted
> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
>
> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> governing role for perf_events based performance monitoring of a system.
>
> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> performance using perf_events subsystem by processes and Perf privileged users
> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> privileged processes [3].

Are there use cases where you would need CAP_SYS_PERFMON where you
would not also need CAP_SYS_ADMIN? If you separate a new capability
from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
with the new capability it is all rather pointless.

The scope you've defined for this CAP_SYS_PERFMON is very small.
Is there a larger set of privilege checks that might be applicable
for it?

>
> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> performance monitoring functionality of perf_events and balance amount of
> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> see Notes to kernel developers, below."
>
> For backward compatibility reasons performance monitoring functionality of
> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> secure performance monitoring use cases is discouraged with respect to the
> introduced CAP_SYS_PERFMON capability.
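
The backward-compatibility rule quoted just above amounts to an either/or check
on the effective capability set. A minimal sketch in Python, not the kernel
code from the patches: the helper name is hypothetical, bit 21 is CAP_SYS_ADMIN
from linux/capability.h, and bit 38 for CAP_SYS_PERFMON is the value shown in
the CapEff dump further down this cover letter.

```python
CAP_SYS_ADMIN = 21    # existing capability number in linux/capability.h
CAP_SYS_PERFMON = 38  # number proposed by this patch set (assumption outside it)


def perfmon_allowed(cap_eff_hex):
    """Hypothetical helper: True if the effective set holds either capability.

    cap_eff_hex is a CapEff value as printed in /proc/<pid>/status.
    """
    cap_eff = int(cap_eff_hex, 16)
    wanted = (1 << CAP_SYS_PERFMON) | (1 << CAP_SYS_ADMIN)
    return bool(cap_eff & wanted)
```

Under this rule a mask holding bit 38 alone passes, which is the case the
question above asks about. Whether other checks on the same code paths still
require CAP_SYS_ADMIN is not something this sketch can show.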
>
> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> [2] to conduct secure performance monitoring using perf_events in the scope
> of available online CPUs when executing code in kernel and user modes.
>
> Possible alternative solution to this capabilities balancing, system security
> hardening task could be to use the existing CAP_SYS_PTRACE capability to govern
> perf_events' performance monitoring functionality, since process debugging is
> similar to performance monitoring with respect to providing insights into
> process memory and execution details. However CAP_SYS_PTRACE still provides
> users with more credentials than are required for secure performance monitoring
> using perf_events subsystem and this excess is avoided by using the dedicated
> CAP_SYS_PERFMON capability.
>
> libcap library utilities [4], [5] and Perf tool can be used to apply
> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> permitted by system wide perf_event_paranoid kernel setting and below are the
> steps to evaluate the advancement suggested by the patch set:
>
> - patch, build and boot the kernel
> - patch, build Perf tool e.g. to /home/user/perf
> ...
> # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> # pushd libcap
> # patch libcap/include/uapi/linux/capability.h with [PATCH 1/3]
> # make
> # pushd progs
> # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> /home/user/perf: OK
> # ./getcap /home/user/perf
> /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> # echo 2 > /proc/sys/kernel/perf_event_paranoid
> # cat /proc/sys/kernel/perf_event_paranoid
> 2
> ...
> $ /home/user/perf top
> ... works as expected ...
> $ cat /proc/`pidof perf`/status
> Name: perf
> Umask: 0002
> State: S (sleeping)
> Tgid: 2958
> Ngid: 0
> Pid: 2958
> PPid: 9847
> TracerPid: 0
> Uid: 500 500 500 500
> Gid: 500 500 500 500
> FDSize: 256
> ...
> CapInh: 0000000000000000
> CapPrm: 0000004400080000
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>                             cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> CapBnd: 0000007fffffffff
> CapAmb: 0000000000000000
> NoNewPrivs: 0
> Seccomp: 0
> Speculation_Store_Bypass: thread vulnerable
> Cpus_allowed: ff
> Cpus_allowed_list: 0-7
> ...
>
> Usage of cap_sys_perfmon effectively avoids unused credentials excess:
> - with cap_sys_admin:
> CapEff: 0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> - with cap_sys_perfmon:
> CapEff: 0000004400080000 => 01000100 00000000 00001000 00000000 00000000
>                               38  34               19
>                      sys_perfmon  syslog           sys_ptrace
>
> The patch set is for tip perf/core repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
>
> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
>
> ---
> Alexey Budankov (3):
>   capabilities: introduce CAP_SYS_PERFMON to kernel and user space
>   perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
>   perf tool: extend Perf tool with CAP_SYS_PERFMON support
>
>  include/linux/perf_event.h          |  6 ++++--
>  include/uapi/linux/capability.h     | 10 +++++++++-
>  security/selinux/include/classmap.h |  4 ++--
>  tools/perf/design.txt               |  3 ++-
>  tools/perf/util/cap.h               |  4 ++++
>  tools/perf/util/evsel.c             | 10 +++++-----
>  tools/perf/util/util.c              | 15 +++++++++++++--
>  7 files changed, 39 insertions(+), 13 deletions(-)
>
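
The CapEff masks in the quoted /proc status output can be decoded
mechanically rather than by counting bits. A small Python sketch under stated
assumptions: the function name and the name table are illustrative, bits 19,
21 and 34 are the existing cap_sys_ptrace, cap_sys_admin and cap_syslog
numbers, and bit 38 for cap_sys_perfmon is the value the cover letter reports
for this patch set.

```python
# Names for the bits that matter in the quoted masks. Bit 38 is the number
# this patch set proposes for cap_sys_perfmon; treat it as an assumption
# on any kernel that does not carry the series.
CAP_NAMES = {
    19: "cap_sys_ptrace",
    21: "cap_sys_admin",
    34: "cap_syslog",
    38: "cap_sys_perfmon",
}


def decode_cap_mask(mask_hex):
    """Return (bit, name) pairs for every bit set in a capability mask.

    mask_hex is a value such as the CapEff field of /proc/<pid>/status.
    Bits without an entry in CAP_NAMES are reported as cap_<bit>.
    """
    mask = int(mask_hex, 16)
    return [
        (bit, CAP_NAMES.get(bit, "cap_%d" % bit))
        for bit in range(mask.bit_length())
        if (mask >> bit) & 1
    ]
```

Calling decode_cap_mask("0000004400080000") yields bits 19, 34 and 38, the
three capabilities set on the perf binary, while the cap_sys_admin style mask
0000007fffffffff has every bit from 0 through 38 set.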