On Wed, Dec 04, 2019 at 02:51:24PM +0000, Robin Murphy wrote:
> On 04/12/2019 11:20 am, Robin Murphy wrote:
> > On 2019-12-04 7:28 am, Andreas Färber wrote:
> >> Hi YanQing,
> >>
> >> + LAKML + Mark + Will
> >>
> >> On 04.12.19 at 05:55, Wang YanQing wrote:
> >>> I use "perf record" to debug performance issue on RTD1296 SOC, it
> >>> does't work, but
> >>> the "perf stat" is ok!
> >>
> >> Thanks for the report - which board, branch and (base) tag are you
> >> testing against? And are you building perf yourself from kernel sources,
> >> or are you using some distro package?
> >>
> >> I only have Busybox in my initrd on DS418; I have not tested perf.
> >>
> >>> After some dig in the kernel, I find the reason is no pmu overflow
> >>> interrupt, I think
> >>> below pmu configuration isn't right for RTD1296:
> >>> "
> >>>         arm_pmu: arm-pmu {
> >>>                 compatible = "arm,cortex-a53-pmu";
> >>>                 interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
> >>>         };
> >>> "
> >>>
> >>> We need 4 PMU SPI for RTD1296 (4 cores), and I guess the 48 isn't
> >>> right too.
> >>
> >> Note that above rtd129x.dtsi snippet is not complete. See rtd1296.dtsi:
> >>
> >> &arm_pmu {
> >>         interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
> >> };
> >
> > That doesn't help much, since 4 affinities for one SPI is rather
> > nonsensical.
> >
> >> 48 and high/4 match what I see in the latest BSP:
> >>
> >> https://github.com/BPI-SINOVOIP/BPI-M4-bsp/blob/master/linux-rtk/arch/arm64/boot/dts/realtek/rtd129x/rtd-1296.dtsi#L116
> >>
> >>
> >>> Any suggestion is welcome.
> >>>
> >>> Thanks!
> >>
> >> The only difference I see is "arm,cortex-a53-pmu" vs. "arm,armv8-pmuv3".
> >> By my reading of arch/arm64/kernel/perf_event.c the only difference
> >> between the two should be the name and an extra cache_map. You could try
> >> the other compatible string in your .dts, but I doubt it'll help.
> >>
> >> Hopefully the Realtek or Arm guys can shed some light.
> >
> > If the SoC really has all 4 overflow interrupts combined into a single
> > SPI line, then sampling just isn't going to be supported - it's
> > unreasonably difficult to handle overflow when the IRQ may be taken on
> > the wrong CPU.
>
> On closer inspection, that BSP kernel implements a whole hrtimer-based
> bodge in arm_pmu to apparently work around not having usable interrupts,
> so yeah, this isn't going to be usable, sorry.
>
> Robin.

Hi all!

Thanks for all the suggestions and the inspection. If we can confirm that
it is a hardware design blunder, then that is acceptable for me. I just
need to make sure it isn't the kernel's fault; if it were, that would be
our fault :)

Thanks.
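
For contrast with the single-SPI node quoted above, here is a rough sketch
of how a PMU node usually looks on an SoC that does wire one overflow
interrupt per core. It only illustrates the shape Robin describes as
workable. The SPI numbers 100-103 are made-up placeholders, not RTD1296
values, and nothing in this thread suggests the RTD1296 actually has such
lines.

```dts
/* Hypothetical sketch: SPI numbers 100-103 are placeholders, NOT RTD1296 values. */
arm_pmu: arm-pmu {
	compatible = "arm,cortex-a53-pmu";
	/* One overflow interrupt per core, instead of a single combined SPI. */
	interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>,
		     <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>;
	/* Entry N here names the CPU that interrupt N above belongs to. */
	interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
};
```

With one SPI per core, each interrupt can be affine to the CPU whose
counter overflowed, which is what "perf record" sampling relies on.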