On Thu, Aug 29, 2019 at 2:57 PM Mao Han <han_mao@xxxxxxxxx> wrote:
>
> This patch set adds perf callchain (FP/DWARF) support for RISC-V.
> It comes from the csky version of the callchain support with some
> slight modifications. The patch set is based on Linux 5.3-rc6.
>
> Changes since v5:
>  - use walk_stackframe from stacktrace.c to handle
>    kernel callchain unwinding (fix invalid mem access)
>
> Changes since v4:
>  - Add missing PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
>    verified with extra CFLAGS(-Wall -Werror)
>
> Changes since v3:
>  - Add more strict check for unwind_frame_kernel
>  - update for kernel 5.3
>
> Changes since v2:
>  - fix inconsistent comment
>  - force to build kernel with -fno-omit-frame-pointer if perf
>    event is enabled
>
> Changes since v1:
>  - simplify implementation and code convention
>
> Cc: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
> Cc: Greentime Hu <green.hu@xxxxxxxxx>
> Cc: Palmer Dabbelt <palmer@xxxxxxxxxx>
> Cc: linux-riscv <linux-riscv@xxxxxxxxxxxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Guo Ren <guoren@xxxxxxxxxx>
>
> Mao Han (3):
>   riscv: Add perf callchain support
>   riscv: Add support for perf registers sampling
>   riscv: Add support for libdw
>
>  arch/riscv/Kconfig                            |  2 +
>  arch/riscv/Makefile                           |  3 +
>  arch/riscv/include/uapi/asm/perf_regs.h       | 42 ++++++++++++
>  arch/riscv/kernel/Makefile                    |  4 +-
>  arch/riscv/kernel/perf_callchain.c            | 95 ++++++++++++++++++++++++++
>  arch/riscv/kernel/perf_regs.c                 | 44 ++++++++++++
>  arch/riscv/kernel/stacktrace.c                |  2 +-
>  tools/arch/riscv/include/uapi/asm/perf_regs.h | 42 ++++++++++++
>  tools/perf/Makefile.config                    |  6 +-
>  tools/perf/arch/riscv/Build                   |  1 +
>  tools/perf/arch/riscv/Makefile                |  4 ++
>  tools/perf/arch/riscv/include/perf_regs.h     | 96 +++++++++++++++++++++++++++
>  tools/perf/arch/riscv/util/Build              |  2 +
>  tools/perf/arch/riscv/util/dwarf-regs.c       | 72 ++++++++++++++++++++
>  tools/perf/arch/riscv/util/unwind-libdw.c     | 57 ++++++++++++++++
>  15 files changed, 469 insertions(+), 3 deletions(-)
>  create mode 100644 arch/riscv/include/uapi/asm/perf_regs.h
>  create mode 100644 arch/riscv/kernel/perf_callchain.c
>  create mode 100644 arch/riscv/kernel/perf_regs.c
>  create mode 100644 tools/arch/riscv/include/uapi/asm/perf_regs.h
>  create mode 100644 tools/perf/arch/riscv/Build
>  create mode 100644 tools/perf/arch/riscv/Makefile
>  create mode 100644 tools/perf/arch/riscv/include/perf_regs.h
>  create mode 100644 tools/perf/arch/riscv/util/Build
>  create mode 100644 tools/perf/arch/riscv/util/dwarf-regs.c
>  create mode 100644 tools/perf/arch/riscv/util/unwind-libdw.c
>

Tested-by: Greentime Hu <greentime.hu@xxxxxxxxxx>

I tested this patch set on top of v5.3-rc6, and perf can backtrace
using either DWARF or FP on the Unleashed board.

# perf record -e cpu-clock --call-graph dwarf ls -l /
total 4
drwxr-xr-x  2 root root     0 Aug 26  2019 bin
drwxr-xr-x  5 root root 12720 Jan  1 00:00 dev
drwxr-xr-x  5 root root     0 Jan  1 00:00 etc
-rwxr-xr-x  1 root root   178 Aug 26  2019 init
drwxr-xr-x  2 root root     0 Aug 26  2019 lib
lrwxrwxrwx  1 root root     3 Aug 19  2019 lib64 -> lib
lrwxrwxrwx  1 root root    11 Aug 19  2019 linuxrc -> bin/busybox
drwxr-xr-x  2 root root     0 Aug 19  2019 media
drwxr-xr-x  2 root root     0 Aug 19  2019 mnt
drwxr-xr-x  2 root root     0 Aug 19  2019 opt
dr-xr-xr-x 66 root root     0 Jan  1 00:00 proc
drwx------  3 root root     0 Jan  1 00:01 root
drwxr-xr-x  3 root root   140 Jan  1 00:00 run
drwxr-xr-x  2 root root     0 Aug 19  2019 sbin
dr-xr-xr-x 11 root root     0 Jan  1 00:00 sys
drwxrwxrwt  2 root root    60 Jan  1 00:00 tmp
drwxr-xr-x  6 root root     0 Aug 26  2019 usr
drwxr-xr-x  4 root root     0 Aug 26  2019 var
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.175 MB perf.data (21 samples) ]

# perf record -e cpu-clock --call-graph fp ls -l /
total 4
drwxr-xr-x  2 root root     0 Aug 26  2019 bin
drwxr-xr-x  5 root root 12720 Jan  1 00:00 dev
drwxr-xr-x  5 root root     0 Jan  1 00:00 etc
-rwxr-xr-x  1 root root   178 Aug 26  2019 init
drwxr-xr-x  2 root root     0 Aug 26  2019 lib
lrwxrwxrwx  1 root root     3 Aug 19  2019 lib64 -> lib
lrwxrwxrwx  1 root root    11 Aug 19  2019 linuxrc -> bin/busybox
drwxr-xr-x  2 root root     0 Aug 19  2019 media
drwxr-xr-x  2 root root     0 Aug 19  2019 mnt
drwxr-xr-x  2 root root     0 Aug 19  2019 opt
dr-xr-xr-x 66 root root     0 Jan  1 00:00 proc
drwx------  3 root root     0 Jan  1 00:00 root
drwxr-xr-x  3 root root   140 Jan  1 00:00 run
drwxr-xr-x  2 root root     0 Aug 19  2019 sbin
dr-xr-xr-x 11 root root     0 Jan  1 00:00 sys
drwxrwxrwt  2 root root    60 Jan  1 00:00 tmp
drwxr-xr-x  6 root root     0 Aug 26  2019 usr
drwxr-xr-x  4 root root     0 Aug 26  2019 var
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB perf.data (19 samples) ]

# perf test
 1: vmlinux symtab matches kallsyms : Skip
 2: Detect openat syscall event : FAILED!
 3: Detect openat syscall event on all cpus : FAILED!
 4: Read samples using the mmap interface : FAILED!
 5: Test data source output : Ok
 6: Parse event definition strings : FAILED!
 7: Simple expression parser : Ok
 8: PERF_RECORD_* events & perf_sample fields : FAILED!
 9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : FAILED!
16: Setup struct perf_event_attr : Skip
17: Match and link multiple hists : Ok
18: 'import perf' in python : FAILED!
19: Breakpoint overflow signal handler : FAILED!
20: Breakpoint overflow sampling : FAILED!
21: Breakpoint accounting : Skip
22: Watchpoint :
22.1: Read Only Watchpoint : FAILED!
22.2: Write Only Watchpoint : FAILED!
22.3: Read / Write Watchpoint : FAILED!
22.4: Modify Watchpoint : FAILED!
23: Number of exit events of a simple workload : Ok
24: Software clock events period values : Ok
25: Object code reading : Ok
26: Sample parsing : Ok
27: Use a dummy software event to keep tracking: Ok
28: Parse with no sample_id_all bit set : Ok
29: Filter hist entries : Ok
30: Lookup mmap thread : Ok
31: Share thread mg : Ok
32: Sort output of hist entries : Ok
33: Cumulate child hist entries : Ok
34: Track with sched_switch : FAILED!
35: Filter fds with revents mask in a fdarray : Ok
36: Add fd to a fdarray, making it autogrow : Ok
37: kmod_path__parse : Ok
38: Thread map : Ok
39: LLVM search and compile :
39.1: Basic BPF llvm compile : Skip
39.2: kbuild searching : Skip
39.3: Compile source for BPF prologue generation: Skip
39.4: Compile source for BPF relocation : Skip
40: Session topology : FAILED!
41: BPF filter :
41.1: Basic BPF filtering : Skip
41.2: BPF pinning : Skip
41.3: BPF prologue generation : Skip
41.4: BPF relocation checker : Skip
42: Synthesize thread map : Ok
43: Remove thread map : Ok
44: Synthesize cpu map : Ok
45: Synthesize stat config : Ok
46: Synthesize stat : Ok
47: Synthesize stat round : Ok
48: Synthesize attr update : Ok
49: Event times : Ok
50: Read backward ring buffer : Skip
51: Print cpu map : Ok
52: Probe SDT events : Skip
53: is_printable_array : Ok
54: Print bitmap : Ok
55: perf hooks : Ok
56: builtin clang support : Skip (not compiled in)
57: unit_number__scnprintf : Ok
58: mem2node : Ok
59: time utils : Ok
60: map_groups__merge_in : Ok
#
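
For comparing runs like the one above, here is a minimal sketch (not part of the patch set) that tallies `perf test` results by status and lists the failing entries. The `tally` helper, the regular expression, and the `SAMPLE` text are illustrative assumptions based only on the line layout shown in this mail; other perf versions may print results differently.

```python
import re
from collections import Counter

# Assumed line layout, taken from the output in this mail:
#   "<number>: <description> : <status>"
# where <status> starts with Ok, FAILED! or Skip.
RESULT_RE = re.compile(r"^\s*(\d+(?:\.\d+)?):\s*(.*?)\s*:\s*(Ok|FAILED!|Skip)")


def tally(output):
    """Return (Counter of statuses, list of failed tests) for perf test text."""
    counts = Counter()
    failed = []
    for line in output.splitlines():
        m = RESULT_RE.match(line)
        if m is None:
            # Group headers such as "22: Watchpoint :" carry no status.
            continue
        number, name, status = m.groups()
        counts[status] += 1
        if status == "FAILED!":
            failed.append(f"{number} {name}")
    return counts, failed


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
 5: Test data source output : Ok
 6: Parse event definition strings : FAILED!
22: Watchpoint :
22.1: Read Only Watchpoint : FAILED!
27: Use a dummy software event to keep tracking: Ok
56: builtin clang support : Skip (not compiled in)
"""

counts, failed = tally(SAMPLE)
print(dict(counts))
print(failed)
```

Saving the full `perf test` output to a file and passing its contents to `tally` would give the per-status totals for a whole run, which makes it easier to spot regressions between kernel versions.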