On Tue, Feb 3, 2009 at 8:32 AM, Dominik 'Rathann' Mierzejewski <dominik@xxxxxxxxxxxxxx> wrote:
>> Do you have benchmarks that show given a constant -mtune that
>> -march=i686 makes a material difference for any significant userspace
>> apps vs -march=i586, if not why are you being so insistent and
>> damning of the current compatible behavior?
>
> Which applications do you suggest for testing this hypothesis?

Most easily measured things (rsvg rendering, freetype) probably aren't
ever performance critical for typical users.

I think the codec idea is good from both the 'likely to benefit from
cmov' perspective and the 'performance actually matters' perspective,
but it ought to be things that Fedora ships. Although, for video, the
video driver (and hardware YUV->RGB) is probably more important than
codec speed.

Ideally the test ought to be done on something modern that can't do
x86_64 (low end atom perhaps), since if you're on x86_64 you ought to
be using the x86_64 distro or suffering whatever performance you don't
get... but I don't have anything x86-only handy.

So, libtheora decoding HD video (average user time in seconds over 10
runs):

libTheora-i586             5.872
libTheora-i686             5.86
libTheora-x86_64           5.643
libTheora-i586(no asm)     9.396
libTheora-i686(no asm)     9.142
libTheora-x86_64(no asm)   8.04

So, we learn: if you want performance for codecs, use hand-coded
assembly or at least x86_64. :)

Going from i586 to i686, the improvement is 0.2% with assembly and
2.7% without asm.

I'm a bit disappointed: this hasn't shown a real-world improvement
(0.2% isn't helpful), but it suggests that one might be possible for
some other application. Can someone suggest something else which is
performance relevant for many users?

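For anyone who wants to check the percentages against the timings above, here is a minimal sketch of the arithmetic. The helper name pct_faster is made up for illustration and is not part of libtheora or of the benchmark runs; it just computes (baseline - candidate) / baseline as a percentage.

```shell
#!/bin/sh
# Hypothetical helper (not from the original benchmark): percent reduction
# in user time when moving from a baseline build to a candidate build.
pct_faster() {
    # $1 = baseline seconds, $2 = candidate seconds
    awk -v base="$1" -v cand="$2" \
        'BEGIN { printf "%.1f\n", (base - cand) / base * 100 }'
}

pct_faster 5.872 5.86     # i586 -> i686, with asm
pct_faster 9.396 9.142    # i586 -> i686, no asm
```

The same function can be pointed at any other pair of rows, for example the i586 and x86_64 no-asm timings.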
---Method disclosure---

Video is http://community.elphel.com/videos/1920x1072_24FPS.ogg

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-m32 -march=i586 -mtune=core2' ./configure --target=i586 ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
5.872

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-m32 -march=i686 -mtune=core2' ./configure --target=i686 ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
5.86

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-m32 -march=i686 -mtune=core2' ./configure --disable-asm ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
9.142

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-m32 -march=i586 -mtune=core2' ./configure --disable-asm ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
9.396

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-mtune=core2' ./configure --disable-asm ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
8.04

[gmaxwell@sonolumen libtheora-1.0]$ CFLAGS='-mtune=core2' ./configure ; make clean ; make -j5
[gmaxwell@sonolumen examples]$ (for i in `seq 1 10` ; do (time -p ./dump_video < 1920x1072_24FPS.ogg > /dev/null) 2>&1 | grep 'user' ; done ) | awk '{sum+=$2} END { print sum/NR}'
5.643

-- 
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list