On Thu, Jun 25, 2020 at 5:14 PM Song Liu <songliubraving@xxxxxx> wrote:
>
> Introduce helper bpf_get_task_stack(), which dumps the stack trace of a
> given task. This is different from bpf_get_stack(), which gets the stack
> trace of the current task. One potential use case of bpf_get_task_stack()
> is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a
> seq_file.
>
> bpf_get_task_stack() uses stack_trace_save_tsk() instead of
> get_perf_callchain() for the kernel stack. The benefit of this choice is
> that stack_trace_save_tsk() doesn't require changes in arch/. The
> downside of using stack_trace_save_tsk() is that it dumps the stack
> trace to an unsigned long array. For 32-bit systems, we need to
> translate it to a u64 array.
>
> Signed-off-by: Song Liu <songliubraving@xxxxxx>
> ---

Looks great. I just think there are cases where the user doesn't
necessarily have a valid task_struct pointer, just a PID, so it would be
nice not to artificially restrict such cases; an extra helper would cover
them.

Acked-by: Andrii Nakryiko <andriin@xxxxxx>

>  include/linux/bpf.h            |  1 +
>  include/uapi/linux/bpf.h       | 35 ++++++++++++++-
>  kernel/bpf/stackmap.c          | 79 ++++++++++++++++++++++++++++++++--
>  kernel/trace/bpf_trace.c       |  2 +
>  scripts/bpf_helpers_doc.py     |  2 +
>  tools/include/uapi/linux/bpf.h | 35 ++++++++++++++-
>  6 files changed, 149 insertions(+), 5 deletions(-)
>

[...]

> +	/* stack_trace_save_tsk() works on unsigned long array, while
> +	 * perf_callchain_entry uses u64 array. For 32-bit systems, it is
> +	 * necessary to fix this mismatch.
> +	 */
> +	if (__BITS_PER_LONG != 64) {
> +		unsigned long *from = (unsigned long *) entry->ip;
> +		u64 *to = entry->ip;
> +		int i;
> +
> +		/* copy data from the end to avoid using extra buffer */
> +		for (i = entry->nr - 1; i >= (int)init_nr; i--)
> +			to[i] = (u64)(from[i]);

Doing this forward would be just fine as well, no? The first iteration
will cast and overwrite the low 32 bits, and subsequent iterations won't
even overlap.
> +	}
> +
> +exit_put:
> +	put_callchain_entry(rctx);
> +
> +	return entry;
> +}
> +

[...]

> +BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf,
> +	   u32, size, u64, flags)
> +{
> +	struct pt_regs *regs = task_pt_regs(task);
> +
> +	return __bpf_get_stack(regs, task, buf, size, flags);
> +}

So this takes advantage of BTF and having a direct task_struct pointer.
But for kprobes/tracepoints I think it would also be extremely helpful to
be able to request a stack trace by PID. How about one more helper that
wraps this one with get/put task by PID, e.g.,
bpf_get_pid_stack(int pid, void *buf, u32 size, u64 flags)? Would that be
a problem?

> +
> +static int bpf_get_task_stack_btf_ids[5];
> +const struct bpf_func_proto bpf_get_task_stack_proto = {
> +	.func		= bpf_get_task_stack,
> +	.gpl_only	= false,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_BTF_ID,
> +	.arg2_type	= ARG_PTR_TO_UNINIT_MEM,
> +	.arg3_type	= ARG_CONST_SIZE_OR_ZERO,
> +	.arg4_type	= ARG_ANYTHING,
> +	.btf_id		= bpf_get_task_stack_btf_ids,
> +};
> +

[...]
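To make the suggestion concrete, here's a rough, untested sketch of what
such a PID-based wrapper could look like (the helper name and the choice
of find_get_task_by_vpid() for the lookup are my guesses, not part of the
patch):

```c
/* Hypothetical sketch only -- not part of the patch. Wraps the proposed
 * bpf_get_task_stack() logic with task lookup and refcounting by PID.
 */
BPF_CALL_4(bpf_get_pid_stack, int, pid, void *, buf, u32, size, u64, flags)
{
	struct task_struct *task;
	long ret;

	/* takes a reference on the task, or returns NULL if not found */
	task = find_get_task_by_vpid(pid);
	if (!task)
		return -ESRCH;

	ret = __bpf_get_stack(task_pt_regs(task), task, buf, size, flags);

	put_task_struct(task);
	return ret;
}
```

The helper proto would then take ARG_ANYTHING for the PID instead of
ARG_PTR_TO_BTF_ID, so it stays usable from kprobes/tracepoints without
BTF.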