Linus Torvalds wrote:
>
> Here's the plain "linus" baseline (ie the "Do register rotation in cpp")
> thing, with the fixed "E += TEMP .." thing):
>
>     linus    0.4018    151.9
>
> and here it is with your patch:
>
>     linus    0.4653    131.2
>
> (ok, so the numbers aren't horribly stable, but the "plain linus" thing
> consistently outperforms here - and underperforms with your patch).

Well, I'd be surprised if one C version were always the winner on every
single cpu; that 13% loss[1] would, I think, be an acceptable compromise
if the goal is to have one implementation that does reasonably well on
all cpus.

That's why I asked how the change did on Nehalem; if it's a measurable
loss on anything modern (Core2+), then of course the P4s must suffer;
and one could always blame the compiler ;)

It's not like the difference in sha1 overhead will be noticeable in
normal git use.

artur

[1] I suspect the old gcc is a factor (4.0.4 does <100M/s here).
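
As a quick check of the 13% figure, here is a minimal sketch that computes the relative throughput drop from the two quoted numbers. It assumes the last column of the benchmark output is throughput in MB/s; the function name is made up for illustration and is not part of any benchmark tool.

```python
def throughput_change(baseline: float, patched: float) -> float:
    """Return the relative change in throughput (negative means slower)."""
    return (patched - baseline) / baseline


# Figures quoted in the message; the unit (MB/s) is an assumption.
baseline = 151.9
patched = 131.2

loss_pct = -throughput_change(baseline, patched) * 100
print(f"throughput loss: {loss_pct:.1f}%")
```

With the quoted numbers this comes out a little under 14%, which is consistent with the rough "13%" above.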