Hi Adrian, On Thu, Aug 15, 2019 at 11:54:54AM +0300, Adrian Hunter wrote: [...] > > diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config > > index e4988f49ea79..d7ff839d8b20 100644 > > --- a/tools/perf/Makefile.config > > +++ b/tools/perf/Makefile.config > > @@ -48,9 +48,20 @@ ifeq ($(SRCARCH),x86) > > NO_PERF_REGS := 0 > > endif > > > > +ARM_PRE_START_SIZE := 0 > > + > > ifeq ($(SRCARCH),arm) > > NO_PERF_REGS := 0 > > LIBUNWIND_LIBS = -lunwind -lunwind-arm > > + ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),) > > + # Extract info from lds: > > + # . = ((0xC0000000)) + 0x00208000; > > + # ARM_PRE_START_SIZE := 0x00208000 > > + ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({2}0x[0-9a-fA-F]+\){2}' \ > > + $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \ > > + sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \ > > + awk -F' ' '{printf "0x%x", $$2}' 2>/dev/null) > > + endif > > endif > > > > ifeq ($(SRCARCH),arm64) > > @@ -58,8 +69,19 @@ ifeq ($(SRCARCH),arm64) > > NO_SYSCALL_TABLE := 0 > > CFLAGS += -I$(OUTPUT)arch/arm64/include/generated > > LIBUNWIND_LIBS = -lunwind -lunwind-aarch64 > > + ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),) > > + # Extract info from lds: > > + # . = ((((((((0xffffffffffffffff)) - (((1)) << (48)) + 1) + (0)) + (0x08000000))) + (0x08000000))) + 0x00080000; > > + # ARM_PRE_START_SIZE := (0x08000000 + 0x08000000 + 0x00080000) = 0x10080000 > > + ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({8}0x[0-9a-fA-F]+\){2}' \ > > + $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \ > > + sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \ > > + awk -F' ' '{printf "0x%x", $$6+$$7+$$8}' 2>/dev/null) > > + endif > > So, that is not going to work if you take a perf.data file to a non-arm machine? Yeah, this patch will only allow perf to work correctly when perf run natively on arm/arm64, so it can resolve partial of the issue. > How come you cannot use kallsyms to get the information? Thanks for pointing out this. Sorry I skipped your comment "I don't know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed the patch v3, I should use that chance to elaborate the detailed idea and so can get more feedback/guidance before procceed. Actually, I have considered to use kallsyms when worked on the previous patch set. As mentioned in patch set v4's cover letter, I tried to implement machine__create_extra_kernel_maps() for arm/arm64, the purpose is to parse kallsyms so can find more kernel maps and thus also can fixup the kernel start address. But I found the 'perf script' tool directly calls machine__get_kernel_start() instead of running into the flow for machine__create_extra_kernel_maps(); so I finally gave up to use machine__create_extra_kernel_maps() for tweaking kernel start address and went back to use this patch's approach by parsing lds files. So for next step, I want to get some guidances: - One method is to add a new weak function, e.g. arch__fix_kernel_text_start(), then every arch can implement its own function to fixup the kernel start address; For arm/arm64, can use kallsyms to find the symbols with least address and fixup for kernel start address. - Another method is to directly parse kallsyms in the function machine__get_kernel_start(), thus the change can be used for all archs; Seems to me the second method is to address this issue as a common issue crossing all archs. But not sure if this is the requirement for all archs or just this is only required for arm/arm64. Please let me know what's your preference or other thoughts. Thanks a lot! Leo. > > endif > > > > +CFLAGS += -DARM_PRE_START_SIZE=$(ARM_PRE_START_SIZE) > > + > > ifeq ($(SRCARCH),csky) > > NO_PERF_REGS := 0 > > endif > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index f6ee7fbad3e4..e993f891bb82 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine) > > machine->kernel_start = 1ULL << 63; > > if (map) { > > err = map__load(map); > > + if (err) > > + return err; > > + > > /* > > * On x86_64, PTI entry trampolines are less than the > > * start of kernel text, but still above 2^63. So leave > > * kernel_start = 1ULL << 63 for x86_64. > > */ > > - if (!err && !machine__is(machine, "x86_64")) > > + if (!machine__is(machine, "x86_64")) > > machine->kernel_start = map->start; > > + > > + /* > > + * On arm/arm64, the kernel uses some memory regions which are > > + * prior to '_stext' symbol; to reflect the complete kernel > > + * address space, compensate these pre-defined regions for > > + * kernel start address. > > + */ > > + if (!strcmp(perf_env__arch(machine->env), "arm") || > > + !strcmp(perf_env__arch(machine->env), "arm64")) > > + machine->kernel_start -= ARM_PRE_START_SIZE; > > } > > return err; > > } > > >