On Wed, 28 Aug 2019 15:08:28 -0700
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:

> On Wed, Aug 28, 2019 at 09:14:21AM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 27, 2019 at 04:01:08PM -0700, Andy Lutomirski wrote:
> >
> > > > Tracing:
> > > >
> > > > CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1)
> > > > are necessary to:
> >
> > That's not tracing, that's perf.
> >
> re: your first comment above.
> I'm not sure what difference you see in words 'tracing' and 'perf'.
> I really hope we don't partition the overall tracing category
> into CAP_PERF and CAP_FTRACE only because these pieces are maintained
> by different people.

I think Peter meant: It's not tracing, it's profiling.

And there is a bit of separation between the two, although there is an
overlap.

Yes, perf can do tracing, but it's designed more for profiling.

> On one side perf_event_open() isn't really doing tracing (as step by
> step ftracing of function sequences), but perf_event_open() opens
> an event and the sequence of events (may include IP) becomes a trace.
> imo CAP_TRACING is the best name to describe the privileged space
> of operations possible via perf_event_open, ftrace, kprobe, stack traces, etc.

I have no issue with what you suggest. I guess it comes down to how
fine-grained people want to go. Do we want it to be all or nothing?
Should CAP_TRACING allow for write access to tracefs? Or should we go
with needing both CAP_TRACING and permissions in that directory
(like changing the group ownership of the files at every boot)?

Perhaps we should have a CAP_TRACING_RO that gives read access to
tracefs (and write if the users have permissions), and have CAP_TRACING
allow full write access as well (allowing users to add kprobe
events and enable tracers like the function tracer).

>
> Another reason are kuprobes. They can be created via perf_event_open
> and via tracefs. Are they in CAP_PERF or in CAP_FTRACE ? In both, right?
> Should then CAP_KPROBE be used ? that would be an overkill.
> It would partition the space even further without obvious need.
>
> Looking from BPF angle... BPF doesn't have integration with ftrace yet.
> bpf_trace_printk is using ftrace mechanism, but that's 1% of ftrace.
> In the long run I'd really like to see bpf using all of ftrace.
> Whereas bpf is using a lot of 'perf'.
> And extending some perf things in bpf specific way.
> Take a look at BPF_F_STACK_BUILD_ID. It's clearly perf/stack_tracing
> feature that generic perf can use one day.
> Currently it sits in bpf land and accessible via bpf only.
> Though it's bpf only today I categorize it under CAP_TRACING.
>
> I think CAP_TRACING privilege should allow task to do all of perf_event_open,
> kuprobe, stack trace, ftrace, and kallsyms.
> We can think of some exceptions that should stay under CAP_SYS_ADMIN,
> but most of the functionality available by 'perf' binary should be
> usable with CAP_TRACING. 'perf' can do bpf too.
> With CAP_BPF it would be all set.

As the above seems to favor the idea of CAP_TRACING allowing write
access to tracefs, should we have a CAP_TRACING_RO for just read access
and limited perf abilities?

-- Steve
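
P.S. To make the two-level idea concrete, here is a rough sketch of the
tracefs access rules floated above, written as a plain decision function
rather than kernel code. Everything in it is hypothetical: the `Cap` bits
just mirror the CAP_TRACING and CAP_TRACING_RO names from this thread, and
`tracefs_allowed()` is an illustrative helper, not an existing interface.
It also assumes that a task holding neither capability is denied, which
the discussion does not settle.

```python
from enum import Flag


class Cap(Flag):
    """Hypothetical capability bits, named after the proposals in this thread."""
    NONE = 0
    TRACING_RO = 1
    TRACING = 2


def tracefs_allowed(caps: Cap, want_write: bool, file_perm_write: bool) -> bool:
    """Return True if the sketched rules would permit the tracefs access.

    caps            -- capabilities held by the task
    want_write      -- True for a write, False for a read
    file_perm_write -- True if ordinary file permissions allow the write
    """
    if Cap.TRACING in caps:
        # Full capability: read and write, regardless of file permissions.
        return True
    if Cap.TRACING_RO in caps:
        # Reads always pass; writes still need file permissions.
        return (not want_write) or file_perm_write
    # Assumption: with neither capability, access is denied.
    return False
```

Under these rules, a task with only the read-only capability could read
trace files freely but could add a kprobe event only if the file
permissions (for example, group ownership set at boot) also allowed the
write.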