From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Wed, 23 May 2012 11:27:00 -0700

> It's not faster to just do something like
>
>	int byte = 4;
>
> #if CONFIG_64BIT
>	byte = 8;
>	if (has_zero_32bit(value >> 32)) {
>		value >>= 32;
>		byte = 4;
>	}
> #endif
>	if (has_zero_16(value >> 16)) {
>		value >>= 16;
>		byte -= 2;
>	}
>	if (!value & 0xff00)
>		byte--;
>	return byte;
>
> which looks like it might generate ok code?

It might be, I'll play around with it.

FWIW, when I code this end case in assembler on sparc64 I just go for a
bunch of conditional moves, so I'll try to come up with something
similar to the above that gcc will emit reasonably.

> Btw, when benchmarking, make sure that your branches do not predict
> well. Because in real life they won't predict well. So you can't
> benchmark the mask->byte function with some well-behaved input that
> commonly returns the same value.

Indeed, and that's why I'd prefer it if gcc were to emit conditional
moves :-)

>> For reference here is the final version of the sparc commit, it works
>> and I've been running tests on it since last night. I'm extremely
>> confident the C code will work on any big-endian machine.
>
> Umm. Except your "top of address space" thing is entirely sparc-specific.

Although a bit more expensive than what you can do on the x86 side with
these tests, I think my code should work.

Comparing get_fs() with USER_DS should be portable enough, as should
STACK_TOP, right? Or does STACK_TOP have some weird semantics on some
architectures that I'm not aware of?

The only other thing is how we are using ~0UL as the limit for the
kernel, and for all practical purposes that ought to be fine too.
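
To make the "conditional moves" idea concrete, here is a rough sketch of
the kind of select-based narrowing discussed above. It is not the code
from the sparc commit. The names first_zero_byte_be(), has_zero_32() and
has_zero_16() are hypothetical, and the helper bodies are an assumed
definition using the usual "subtract ones, mask high bits" zero-byte
test. It assumes a 64-bit word that is known to contain at least one
zero byte, and it numbers bytes big-endian style, with the most
significant byte as index 0. Whether gcc actually turns the ternaries
into conditional moves depends on the target and options, so that part
is a hope rather than a guarantee.

```c
#include <stdint.h>

/* Assumed helper: nonzero if any byte of the 32-bit value is zero. */
static unsigned int has_zero_32(uint32_t x)
{
	return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Assumed helper: nonzero if either byte of the 16-bit value is zero. */
static unsigned int has_zero_16(uint16_t x)
{
	uint32_t y = x;	/* widen so the subtraction is unsigned */

	return ((y - 0x0101u) & ~y & 0x8080u) != 0;
}

/*
 * Index (0 = most significant byte) of the first zero byte in v.
 * Precondition: v contains at least one zero byte.
 */
static unsigned int first_zero_byte_be(uint64_t v)
{
	unsigned int idx = 0;
	unsigned int skip;
	uint32_t w;
	uint16_t h;

	/* No zero in the upper 32 bits: continue in the lower half. */
	skip = !has_zero_32((uint32_t)(v >> 32));
	w = skip ? (uint32_t)v : (uint32_t)(v >> 32);
	idx += skip ? 4 : 0;

	/* Same narrowing step on the selected 32-bit half. */
	skip = !has_zero_16((uint16_t)(w >> 16));
	h = skip ? (uint16_t)w : (uint16_t)(w >> 16);
	idx += skip ? 2 : 0;

	/* Top byte nonzero means the zero is the low byte. */
	idx += (h & 0xff00) != 0;

	return idx;
}
```

Each step selects a half and adjusts the index without any control flow
depending on the data, which is the property that matters when the
branches would not predict well.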