On Tue, Feb 3, 2009 at 3:45 PM, Dominik 'Rathann' Mierzejewski
<dominik@xxxxxxxxxxxxxx> wrote:
>> There are certainly cases where cmov can be faster. Perhaps exclusively
>> on older micro architectures (P4s, early Core2, maybe AMD, haven't
>> checked). But in general it's no win.
>
> Well, I talk to people who write hand-optimized assembly and care to
> squeeze every cycle out of various CPUs and they say it's definitely
> a win.

GCC is not a person who writes hand-optimized assembly, yet it is
GCC's use of cmov that matters to us. It wouldn't surprise me to find
that profile-driven use of CMOV works a lot better than the generic
case.

These people can continue to write their cmov using ASM. If they are
doing that kind of tuning work then they are likely also doing SSE
detection and can handle switching code variants at run time.

> So please, show me some code instead of hand-waving.

…

I did post benchmarks of libTheora showing a 0.2% gain on core2 from
cmov. Perhaps you'd care to benchmark freetype?
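
For illustration only, here is a rough sketch of what "write their cmov
using ASM" plus switching code variants at run time could look like. It
is not taken from any project discussed in this thread. The names
max_branch, max_cmov, pick_max and self_check are made up for the
example. The CPU check assumes a GCC-compatible compiler on x86 that
provides __builtin_cpu_supports (newer GCC releases do); on anything
else the sketch simply falls back to the plain C variant.

```c
#include <assert.h>

/* Assumption: GNU-style inline asm and CPU builtins exist only here. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X86_GNU_ASM 1
#else
#define HAVE_X86_GNU_ASM 0
#endif

typedef int (*max_fn)(int, int);

/* Generic variant: the compiler decides between a jump and a cmov. */
static int max_branch(int a, int b)
{
    if (a > b)
        return a;
    return b;
}

#if HAVE_X86_GNU_ASM
/* Hand-written variant (hypothetical): forces a cmov via inline asm. */
static int max_cmov(int a, int b)
{
    __asm__("cmpl %1, %0\n\t"   /* set flags from a - b        */
            "cmovl %1, %0"      /* if a < b (signed), a = b    */
            : "+r"(a)
            : "r"(b)
            : "cc");
    return a;
}
#endif

/* Pick a variant once, at run time, based on what the CPU reports. */
max_fn pick_max(void)
{
#if HAVE_X86_GNU_ASM
    __builtin_cpu_init();
    if (__builtin_cpu_supports("cmov"))
        return max_cmov;
#endif
    return max_branch;
}

/* Check that the selected variant agrees with the generic one. */
void self_check(void)
{
    static const int v[] = { -1000, -1, 0, 1, 2, 7, 1000 };
    const int n = (int)(sizeof v / sizeof v[0]);
    max_fn f = pick_max();

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(f(v[i], v[j]) == max_branch(v[i], v[j]));
}
```

A caller would fetch the pointer once (max_fn f = pick_max();) and use
f(a, b) in the hot loop. Whether the asm variant is actually faster is
exactly what would have to be benchmarked per CPU; the sketch only
shows the dispatch mechanism, not a performance claim.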