Dear list,

I'm looking for a way to compare the performance of two versions of a piece of code inside the kernel. I was able to do some comparison in user land, but I want to test the specific portion of code inside the kernel.

I want to compare the code at line 1195 of drivers/media/video/videobuf2-core.c:

    /*
     * Reinitialize all buffers for next use.
     */
    for (i = 0; i < q->num_buffers; ++i)
        q->bufs[i]->state = VB2_BUF_STATE_DEQUEUED;

With:

    /* buf2 */
    /*
     * Reinitialize all buffers for next use.
     */
    buf_ptr_end = q->bufs[q->num_buffers];
    for (buf_ptr = q->bufs[0]; buf_ptr < buf_ptr_end; ++buf_ptr)
        buf_ptr->state = VB2_BUF_STATE_DEQUEUED;

To test in user land, I created two separate C source files, compiled them with gcc -O2, and then ran the "perf" tool on the entire application.

With num_buffers = 131072:

    $ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-references,cache-misses -r 2048 ./buf1

     Performance counter stats for './buf1' (2048 runs):

        16,538,039 cycles                  #  0.000 GHz                   (+-0.06%) [80.23%]
         6,917,411 stalled-cycles-frontend # 41.83% frontend cycles idle  (+-0.14%) [80.25%]
         4,686,384 stalled-cycles-backend  # 28.34% backend cycles idle   (+-0.14%) [80.28%]
           148,990 cache-references                                       (+-0.38%) [80.24%]
            71,180 cache-misses            # 47.775 % of all cache refs   (+-0.22%) [88.14%]

       0.005234340 seconds time elapsed

    $ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-references,cache-misses -r 2048 ./buf2

     Performance counter stats for './buf2' (2048 runs):

        14,740,563 cycles                  #  0.000 GHz                   (+-0.04%) [77.89%]
         5,187,716 stalled-cycles-frontend # 35.19% frontend cycles idle  (+-0.14%) [77.81%]
         3,383,748 stalled-cycles-backend  #
           101,894 cache-references                                       (+-0.23%) [84.60%]
            66,647 cache-misses            # 65.408 % of all cache refs   (+-0.14%) [90.52%]

       0.004661826 seconds time elapsed                                   (+-0.06%)

But I want to repeat the tests on a specific portion of code, not on the entire application. Is there a safe way to do something like:

    start_bench ( ?? );   /* start measurement */

    buf_ptr_end = q->bufs[q->num_buffers];
    for (buf_ptr = q->bufs[0]; buf_ptr < buf_ptr_end; ++buf_ptr)
        buf_ptr->state = VB2_BUF_STATE_DEQUEUED;

    end_bench ( ?? );     /* end measurement */

And is this the correct approach for testing the performance of a specific portion of kernel code?

Thank you!

Peter

-- 
Peter Senna Tschudin
peter.senna@xxxxxxxxx
gpg id: 48274C36
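
P.S. To make the start_bench()/end_bench() idea concrete, here is a minimal user-land sketch of timing only the region of interest instead of the whole program. It is an illustration under assumptions, not the kernel code: struct buf is a hypothetical stand-in stored in a flat array (not the real vb2_buffer layout), and the names now_ns, bench_region, reset_indexed, reset_pointer and all_dequeued are made up for this example. It uses clock_gettime(CLOCK_MONOTONIC), which is not available inside the kernel; there a timestamp source such as ktime_get() would presumably take its place, and I have not tested that part.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins; NOT the kernel's vb2 definitions. */
enum buf_state { BUF_STATE_QUEUED, BUF_STATE_DEQUEUED };
struct buf { enum buf_state state; };

/* Monotonic timestamp in nanoseconds. */
uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc; /* avoids an unused-variable warning when NDEBUG is set */
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Variant 1: indexed loop over the array. */
void reset_indexed(struct buf *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; ++i)
		bufs[i].state = BUF_STATE_DEQUEUED;
}

/* Variant 2: pointer walk with a precomputed end pointer. */
void reset_pointer(struct buf *bufs, size_t n)
{
	struct buf *p, *end = bufs + n;

	for (p = bufs; p < end; ++p)
		p->state = BUF_STATE_DEQUEUED;
}

/* Returns 1 if every buffer is in the DEQUEUED state, 0 otherwise. */
int all_dequeued(const struct buf *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; ++i)
		if (bufs[i].state != BUF_STATE_DEQUEUED)
			return 0;
	return 1;
}

/*
 * Time only the call to fn: take a timestamp immediately before and
 * immediately after, and return the difference in nanoseconds.
 */
uint64_t bench_region(void (*fn)(struct buf *, size_t),
		      struct buf *bufs, size_t n)
{
	uint64_t start, end;

	start = now_ns();	/* start measurement */
	fn(bufs, n);
	end = now_ns();		/* end measurement */
	return end - start;
}
```

It would be called as bench_region(reset_indexed, bufs, n) and bench_region(reset_pointer, bufs, n), repeated many times and averaged, since a single run of such a short loop is dominated by timer overhead and cache state.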