OK. I'll try to use tracing when needed. The hardware I was using for the test was Malta-R with 34K bitfile programed into the FPGA. The CPU frequency is 50MHz. Deng-Cheng 2011/1/28 David Daney <ddaney@xxxxxxxxxxxxxxxxxx>: > On 01/26/2011 10:24 PM, Deng-Cheng Zhu wrote: >> >> Using your attached patch, I experimented -c and -F by 'ls /'. The numbers >> I used are 10, 1000 and 100000 for both -c and -F. >> >> The number of samples I got was 24 all the way. That means the event >> period >> to sample and the profiling frequency do not affect the results on MIPS32 >> platform. While working on the old code, the system had the following >> results: >> >> -c 10: The system seems busy dealing with interrupts. And the following >> log >> was printed out: >> ================================================ >> hda: ide_dma_sff_timer_expiry: DMA status (0x24) >> hda: DMA interrupt recovery >> hda: lost interrupt >> ================================================ >> This does need to be fixed later on. >> -c 1000: ~11085 samples >> -c 100000: ~48 samples ('perf report' still showed some data.) >> -F 10: ~118 samples >> -F 1000: ~352 samples >> -F 100000: ~379 samples >> >> I'll try to take time to look into the patch to see if anything can be >> changed. >> > > I have found it useful to enable tracing, and then placing trace_printk() in > mipspmu_event_set_period() to look at the values of: > > sample_period, period_left that are being used. > > Also you could use a trace_printk() in mipsxx_pmu_write_counter() to check > the value being written to the register. > > What hardware are you using to test this? I wonder if there is a board with > a 32-bit CPU that I could get access to. > > David Daney > > >> >> Deng-Cheng >> >> >> 2011/1/26 David Daney<ddaney@xxxxxxxxxxxxxxxxxx>: >>> >>> On 01/24/2011 07:42 PM, Deng-Cheng Zhu wrote: >>>> >>>> Hi, David >>>> >>>> >>>> This version does fix the problem with 'perf stat'. However, when >>>> working >>>> with 'perf record', the following happened: >>>> >>>> -sh-4.0# perf record -f -e cycles -e instructions -e branches \ >>>>> >>>>> -e branch-misses -e r12 find / -name "*sys*">/dev/null >>>> >>>> [ perf record: Woken up 1 times to write data ] >>>> [ perf record: Captured and wrote 0.001 MB perf.data (~53 samples) ] >>> >>> >>> I get the same thing. What happens if you supply either '-c xxx' or '-f >>> xxx'? >>> >>> I get:octeon:~/linux/tools/perf# ./perf record -e cycles /bin/ls -l / >>> total 100 >>> drwxr-xr-x 2 root root 4096 2010-11-12 11:39 bin >>> [...] >>> drwxr-xr-x 13 root root 4096 2007-05-25 12:28 var >>> [ perf record: Woken up 1 times to write data ] >>> [ perf record: Captured and wrote 0.002 MB perf.data (~82 samples) ] >>> >>> Almost no samples as you got. >>> >>> But if I do: >>> >>> octeon:~/linux/tools/perf# ./perf record -F 100000 -e cycles /bin/ls -l / >>> total 100 >>> drwxr-xr-x 2 root root 4096 2010-11-12 11:39 bin >>> [...] >>> drwxr-xr-x 13 root root 4096 2007-05-25 12:28 var >>> [ perf record: Woken up 1 times to write data ] >>> [ perf record: Captured and wrote 0.404 MB perf.data (~17653 samples) ] >>> >>> Look many more samples! >>> >>> The question is, what is it supposed to do? >>> >>> If you can get a reasonable number of samples out if you supply -c or >>> -F, then I would argue that it is working and the default settings for >>> -F are not a good fit for your test case. >>> >>> I have slightly changed the patch. You could try the attached version >>> instead and tell me the results. >>> >>> >>> David Daney >>> >>> >>> >> > >