Hi,

On Thu, Nov 07, 2013 at 09:59:38PM +0000, Ramsay Jones wrote:
> +static inline uint64_t default_bswap64(uint64_t val)
> +{
> +	return (((val & (uint64_t)0x00000000000000ffULL) << 56) |
> +		((val & (uint64_t)0x000000000000ff00ULL) << 40) |
> +		((val & (uint64_t)0x0000000000ff0000ULL) << 24) |
> +		((val & (uint64_t)0x00000000ff000000ULL) <<  8) |
> +		((val & (uint64_t)0x000000ff00000000ULL) >>  8) |
> +		((val & (uint64_t)0x0000ff0000000000ULL) >> 24) |
> +		((val & (uint64_t)0x00ff000000000000ULL) >> 40) |
> +		((val & (uint64_t)0xff00000000000000ULL) >> 56));
> +}

This got me thinking.  To swap 8 bytes this function performs 8 bitwise
shifts, 8 bitwise ANDs and 7 bitwise ORs plus uses 8 64bit constants.
We could do better than that:

static inline uint64_t hacked_bswap64(uint64_t val)
{
	uint64_t tmp = val << 32 | val >> 32;
	return (((tmp & (uint64_t)0xff000000ff000000ULL) >> 24) |
		((tmp & (uint64_t)0x00ff000000ff0000ULL) >>  8) |
		((tmp & (uint64_t)0x0000ff000000ff00ULL) <<  8) |
		((tmp & (uint64_t)0x000000ff000000ffULL) << 24));
}

This performs only 6 shifts, 4 ANDs, 4 ORs and uses 4 64bit constants.

bswap64ing 1000000000 64bit ints with default_bswap64() compiled with
-O2 takes:

  real    0m1.808s
  user    0m1.796s
  sys     0m0.000s

The same with hacked_bswap64():

  real    0m0.823s
  user    0m0.816s
  sys     0m0.000s

I doubt that in normal usage git would spend enough time bswap64ing to
make this noticeable, but it was a fun micro-optimization on a wet
Thursday evening nevertheless :)

Best,
Gábor
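
For anyone who wants to convince themselves that hacked_bswap64() really computes the same result as default_bswap64(), here is a minimal sketch. It repeats both functions from this mail unchanged and adds a small comparison helper. The helper name bswap64_variants_agree() is made up for illustration and is not part of the patch. It only checks correctness and does not reproduce the timings.

```c
#include <assert.h>
#include <stdint.h>

/* Version from the quoted patch: one mask and one shift per byte. */
static inline uint64_t default_bswap64(uint64_t val)
{
	return (((val & (uint64_t)0x00000000000000ffULL) << 56) |
		((val & (uint64_t)0x000000000000ff00ULL) << 40) |
		((val & (uint64_t)0x0000000000ff0000ULL) << 24) |
		((val & (uint64_t)0x00000000ff000000ULL) <<  8) |
		((val & (uint64_t)0x000000ff00000000ULL) >>  8) |
		((val & (uint64_t)0x0000ff0000000000ULL) >> 24) |
		((val & (uint64_t)0x00ff000000000000ULL) >> 40) |
		((val & (uint64_t)0xff00000000000000ULL) >> 56));
}

/*
 * Variant from this mail: first swap the two 32-bit halves, then
 * reverse the bytes inside both halves at once, so that each mask
 * handles two bytes.
 */
static inline uint64_t hacked_bswap64(uint64_t val)
{
	uint64_t tmp = val << 32 | val >> 32;
	return (((tmp & (uint64_t)0xff000000ff000000ULL) >> 24) |
		((tmp & (uint64_t)0x00ff000000ff0000ULL) >>  8) |
		((tmp & (uint64_t)0x0000ff000000ff00ULL) <<  8) |
		((tmp & (uint64_t)0x000000ff000000ffULL) << 24));
}

/*
 * Hypothetical helper (not in the patch): returns 1 if both variants
 * agree on val and swapping twice gives val back, 0 otherwise.
 */
static int bswap64_variants_agree(uint64_t val)
{
	return default_bswap64(val) == hacked_bswap64(val) &&
	       hacked_bswap64(hacked_bswap64(val)) == val;
}
```

Calling bswap64_variants_agree() on a handful of values with distinct bytes, such as 0x0102030405060708ULL, should be enough to catch a wrong mask or shift count in either function.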