On Thu, Oct 10, 2019 at 10:12:51AM +0200, Peter Zijlstra wrote: > On Wed, Oct 09, 2019 at 04:36:57PM -0400, Joel Fernandes (Google) wrote: > > In currentl mainline, the degree of access to perf_event_open(2) system > > call depends on the perf_event_paranoid sysctl. This has a number of > > limitations: > > > > 1. The sysctl is only a single value. Many types of accesses are controlled > > based on the single value thus making the control very limited and > > coarse grained. > > 2. The sysctl is global, so if the sysctl is changed, then that means > > all processes get access to perf_event_open(2) opening the door to > > security issues. > > > > This patch adds LSM and SELinux access checking which will be used in > > Android to access perf_event_open(2) for the purposes of attaching BPF > > programs to tracepoints, perf profiling and other operations from > > userspace. These operations are intended for production systems. > > > > 5 new LSM hooks are added: > > 1. perf_event_open: This controls access during the perf_event_open(2) > > syscall itself. The hook is called from all the places that the > > perf_event_paranoid sysctl is checked to keep it consistent with the > > systctl. The hook gets passed a 'type' argument which controls CPU, > > kernel and tracepoint accesses (in this context, CPU, kernel and > > tracepoint have the same semantics as the perf_event_paranoid sysctl). > > Additionally, I added an 'open' type which is similar to > > perf_event_paranoid sysctl == 3 patch carried in Android and several other > > distros but was rejected in mainline [1] in 2016. > > > > 2. perf_event_alloc: This allocates a new security object for the event > > which stores the current SID within the event. It will be useful when > > the perf event's FD is passed through IPC to another process which may > > try to read the FD. Appropriate security checks will limit access. > > > > 3. perf_event_free: Called when the event is closed. > > > > 4. perf_event_read: Called from the read(2) system call path for the event. > > + mmap() > > > > 5. perf_event_write: Called from the read(2) system call path for the event. > > - read() + ioctl() Fixed. > > fresh from the keyboard.. but maybe consoldate things a little. Looks great to me, I folded it into the patch. Thanks Peter! Just one comment on change in existing logic of the code, below: [snip] > --- a/arch/x86/events/intel/p4.c > +++ b/arch/x86/events/intel/p4.c > @@ -8,7 +8,6 @@ > */ > > #include <linux/perf_event.h> > -#include <linux/security.h> > > #include <asm/perf_event_p4.h> > #include <asm/hardirq.h> > @@ -777,10 +776,7 @@ static int p4_validate_raw_event(struct > * the user needs special permissions to be able to use it > */ > if (p4_ht_active() && p4_event_bind_map[v].shared) { > - if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN)) > - return -EACCES; > - > - v = security_perf_event_open(&event->attr, PERF_SECURITY_CPU); > + v = perf_allow_cpu(&event->attr); > if (v) > return v; > } > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -56,6 +56,7 @@ struct perf_guest_info_callbacks { > #include <linux/perf_regs.h> > #include <linux/cgroup.h> > #include <linux/refcount.h> > +#include <linux/security.h> > #include <asm/local.h> > > struct perf_callchain_entry { > @@ -1244,19 +1245,28 @@ extern int perf_cpu_time_max_percent_han > int perf_event_max_stack_handler(struct ctl_table *table, int write, > void __user *buffer, size_t *lenp, loff_t *ppos); > > -static inline bool perf_paranoid_tracepoint_raw(void) > +static inline int perf_allow_kernel(struct perf_event_attr *attr) > { > - return sysctl_perf_event_paranoid > -1; > + if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN)) > + return -EACCES; > + > + return security_perf_event_open(attr, PERF_SECURITY_KERNEL); > } > > -static inline bool perf_paranoid_cpu(void) > +static inline int perf_allow_cpu(struct perf_event_attr *attr) > { > - return sysctl_perf_event_paranoid > 0; > + if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN)) > + return -EACCES; > + > + return security_perf_event_open(attr, PERF_SECURITY_CPU); > } > > -static inline bool perf_paranoid_kernel(void) > +static inline int perf_allow_tracepoint(struct perf_event_attr *attr) > { > - return sysctl_perf_event_paranoid > 1; > + if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN)) > + return -EPERM; > + Here the sysctl check of > -1 also is now coupled with a CAP_SYS_ADMIN check. However.. > + return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT); > } > > extern void perf_event_init(void); > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -4229,10 +4229,7 @@ find_get_context(struct pmu *pmu, struct > > if (!task) { > /* Must be root to operate on a CPU event: */ > - if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN)) > - return ERR_PTR(-EACCES); > - > - err = security_perf_event_open(&event->attr, PERF_SECURITY_CPU); > + err = perf_allow_cpu(&event->attr); > if (err) > return ERR_PTR(err); > > @@ -5862,14 +5859,8 @@ static int perf_mmap(struct file *file, > lock_limit >>= PAGE_SHIFT; > locked = atomic64_read(&vma->vm_mm->pinned_vm) + extra; > > - if (locked > lock_limit) { > - if (perf_paranoid_tracepoint_raw() && !capable(CAP_IPC_LOCK)) { > - ret = -EPERM; > - goto unlock; > - } > - > - ret = security_perf_event_open(&event->attr, > - PERF_SECURITY_TRACEPOINT); > + if (locked > lock_limit && !capable(CAP_IPC_LOCK)) { > + ret = perf_allow_tracepoint(&event->attr); In previous code, this check did not involve a check for CAP_SYS_ADMIN. I am Ok with adding the CAP_SYS_ADMIN check as well which does make sense to me for tracepoint access. But it is still a change in the logic so I wanted to bring it up. Let me know any other thoughts and then I'll post a new patch. thanks, - Joel [snip]