>>> On 19.08.12 at 17:15, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>> >--- a/arch/x86/include/asm/arch_hweight.h
>> >+++ b/arch/x86/include/asm/arch_hweight.h
>> >@@ -25,9 +25,14 @@ static inline unsigned int __arch_hweight32(unsigned int w)
>> >{
>> > unsigned int res = 0;
>> >
>> >+#ifdef CONFIG_LTO
>> >+ res = __sw_hweight32(w);
>> >+#else
>> >+
>> > asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
>> > : "="REG_OUT (res)
>> > : REG_IN (w));
>> >+#endif
>>
>> Isn't this a little too harsh? Rather than not using popcnt at all, why don't
>> you just add the necessary clobbers to the asm() in the LTO case?
>
> gcc lacks the means to declare that an asm uses an external symbol
> currently. Ok we could make it visible. But there's no way to make the
> special calling convention work anyways, at least not without someone
> changing gcc to allow to declare this per function.

That's not the point: The point really is that you could allow the
alternative regardless of LTO, and just penalize the LTO case by
having even the asm clobber the registers that a function call
would not preserve.

> I'm not sure the optimization is really worth it anyways, hweight should
> be uncommon.

That's a separate question (but I sort of agree - not sure whether
CPU mask weights ever get calculated on hot paths).

Jan
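
To make the suggestion concrete, here is a rough user-space sketch of what "add the necessary clobbers" could look like. It is not the kernel code and not a tested patch: sw_hweight32() and hweight32_via_call() are hypothetical names made up for this illustration, there is no ALTERNATIVE() patching (only the "call" side is shown), and the red-zone adjustment is needed only because user-space x86-64 code, unlike the kernel, is built with a red zone. It also assumes the callee is integer-only, so no SSE/x87 registers are listed. On anything other than x86-64 Linux with a GNU-compatible compiler it simply falls back to a plain C call.

```c
/* Software popcount using the usual bit-twiddling approach.
 * Hypothetical stand-in for __sw_hweight32; built with the normal
 * calling convention, i.e. it may clobber any caller-saved register. */
__attribute__((noinline, used))
unsigned int sw_hweight32(unsigned int w)
{
	w -= (w >> 1) & 0x55555555u;
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

static inline unsigned int hweight32_via_call(unsigned int w)
{
#if defined(__x86_64__) && defined(__GNUC__) && defined(__linux__)
	unsigned int res;

	/* Argument in %edi, result in %eax. %rdi is marked read-write
	 * because the callee is free to modify it. Every other integer
	 * register that the SysV ABI does not preserve across a call is
	 * listed as a clobber, which is the "penalty" for not having a
	 * register-preserving callee. */
	__asm__ __volatile__(
		"sub $128, %%rsp\n\t"	/* step over the red zone (user space only) */
		"call sw_hweight32\n\t"
		"add $128, %%rsp"
		: "=a" (res), "+D" (w)
		:
		: "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
		  "memory", "cc");
	return res;
#else
	/* Other targets: plain C call, the compiler handles the ABI. */
	return sw_hweight32(w);
#endif
}
```

With a clobber list along these lines, the popcnt replacement could stay in place in the LTO build as well. The cost is that the compiler has to treat the asm as if it were an ordinary function call as far as register allocation is concerned.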