* Avi Kivity <avi@xxxxxxxxxx> wrote:

> On 04/11/2010 12:37 PM, Jason Garrett-Glaser wrote:
> >
> >># time x264 --crf 20 --quiet crowd_run_2160p.y4m -o /dev/null --threads 2
> >>yuv4mpeg: 3840x2160@50/1fps, 1:1
> >>
> >>encoded 500 frames, 0.68 fps, 251812.80 kb/s
> >>
> >>real 12m17.154s
> >>user 20m39.151s
> >>sys 0m11.727s
> >>
> >># echo never> /sys/kernel/mm/transparent_hugepage/enabled
> >># echo never> /sys/kernel/mm/transparent_hugepage/khugepaged/enabled
> >># time x264 --crf 20 --quiet crowd_run_2160p.y4m -o /dev/null --threads 2
> >>yuv4mpeg: 3840x2160@50/1fps, 1:1
> >>
> >>encoded 500 frames, 0.66 fps, 251812.80 kb/s
> >>
> >>real 12m37.962s
> >>user 21m13.506s
> >>sys 0m11.696s
> >>
> >>Just 2.7%, even though the working set was much larger.
> >Did you make sure to check your stddev on those?
>
> I'm doing another run to look at variability.

Sigh. Could you please stop using stone-age tools like /usr/bin/time and
instead use:

   perf stat --repeat 3 x264 ...

You can install it via:

   cd linux
   cd tools/perf/
   make -j install

That way you will see 'variability' (stddev/error bars/fuzz), plus a whole
lot of other CPU details, on top of much more precise measurements:

 $ perf stat --repeat 3 x264 --crf 20 --quiet soccer_4cif.y4m -o /dev/null --threads 2

 yuv4mpeg: 704x576@60/1fps, 128:117

 encoded 2 frames, 23.47 fps, 39824.64 kb/s
 yuv4mpeg: 704x576@60/1fps, 128:117

 encoded 2 frames, 23.52 fps, 39824.64 kb/s
 yuv4mpeg: 704x576@60/1fps, 128:117

 encoded 2 frames, 23.45 fps, 39824.64 kb/s

 Performance counter stats for 'x264 --crf 20 --quiet soccer_4cif.y4m -o /dev/null --threads 2' (3 runs):

     130.624286  task-clock-msecs     #      1.496 CPUs    ( +-   0.081% )
             74  context-switches     #      0.001 M/sec   ( +-   7.151% )
              3  CPU-migrations       #      0.000 M/sec   ( +-  25.000% )
           2987  page-faults          #      0.023 M/sec   ( +-   0.162% )
      389234822  cycles               #   2979.804 M/sec   ( +-   0.081% )
      481360693  instructions         #      1.237 IPC     ( +-   0.036% )
        4206296  cache-references     #     32.201 M/sec   ( +-   0.387% )
          55732  cache-misses         #      0.427 M/sec   ( +-   0.529% )

    0.087336553  seconds time elapsed   ( +-   0.100% )

Note that perf stat will run fine on older [pre-2.6.31] kernels too (there it
will only measure elapsed time), and even then it will be much more precise
than /usr/bin/time.

For more dTLB details, use something like:

   perf stat -e cycles -e instructions -e dtlb-loads -e dtlb-load-misses --repeat 3 x264 ...

Yes, I know we had a big flamewar about perf kvm, but IMHO that is no reason
for you to pretend that this tool doesn't exist ;-)

> > I'm also curious how it compares for --preset ultrafast and so forth.
>
> Is this something realistic or just a benchmark thing?

I'd suggest you use the default settings, to make it realistic. (Maybe also
the 'advanced/high-quality' settings that an advanced user would utilize.)

There is no doubt that benchmark advantages can be shown. The point of this
exercise is to show that hugetlb gives us real-life speedups in various
categories of non-server apps.

	Ingo
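
For reference, here is a minimal Python sketch of the arithmetic behind the two
kinds of numbers discussed above: the relative difference between the two
/usr/bin/time runs, and a mean with an error figure across repeated runs. The
helper names (parse_time, relative_saving, mean_and_relative_error) are made up
for illustration, and the error figure here is the standard error of the mean
divided by the mean; that this matches what perf stat prints as "+-" is an
assumption, not something stated in this thread.

```python
import math
import statistics


def parse_time(text):
    """Convert a time(1)-style string such as '12m17.154s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def relative_saving(fast, slow):
    """Fraction of the slower run's time that the faster run saves."""
    return (slow - fast) / slow


def mean_and_relative_error(samples):
    """Return (mean, standard error of the mean as a fraction of the mean).

    Assumption: this is one common way to express run-to-run noise; it is
    not taken from the perf source.
    """
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, std_err / mean


# Wall-clock ("real") times quoted in the thread.
thp_on = parse_time("12m17.154s")   # transparent hugepages enabled
thp_off = parse_time("12m37.962s")  # transparent hugepages set to "never"
print(f"real time saved with THP: {relative_saving(thp_on, thp_off):.1%}")

# fps figures reported by the three x264 runs under perf stat --repeat 3.
fps = [23.47, 23.52, 23.45]
mean, rel_err = mean_and_relative_error(fps)
print(f"fps: {mean:.2f} ( +- {rel_err:.3%} )")
```

With only one run per configuration, as in the /usr/bin/time numbers, there is
nothing to feed into the second helper, which is why a single 2.7% difference
cannot be judged without repeated runs.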