On Fri, Dec 21, 2018 at 11:46:16PM +0300, Cyrill Gorcunov wrote:
> Cast to unsigned char is needed in any case. And as far as I remember
> we've been using this multiplication trick for a really long time
> in x86 land. I'm out of sources right now but it should be somewhere
> in assembly libs.

x86 isn't the only CPU. Some CPUs have slow multiplies but fast shifts.
Also loading 0x0101010101010101 into a register may be inefficient on
some CPUs.
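
For reference, here is a minimal sketch of the two ways of building the
fill word that are being compared. The function names are hypothetical
and not taken from any patch in this thread, and a 64-bit word is
assumed. Both produce the same value; which one is cheaper depends on
the CPU and on what the compiler makes of it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: replicate the low byte of c into all eight
 * bytes of a 64-bit word using a single multiply. The cast to
 * unsigned char keeps a negative or out-of-range c from leaking
 * into the upper bytes.
 */
static uint64_t fill_word_mul(int c)
{
	return (uint64_t)(unsigned char)c * 0x0101010101010101ULL;
}

/*
 * Hypothetical helper: same result using only shifts and ORs, so
 * neither a multiply nor the 64-bit constant is needed.
 */
static uint64_t fill_word_shift(int c)
{
	uint64_t v = (unsigned char)c;

	v |= v << 8;	/* 2 bytes filled */
	v |= v << 16;	/* 4 bytes filled */
	v |= v << 32;	/* 8 bytes filled */
	return v;
}
```

The shift variant takes three dependent shift/OR steps in place of one
multiply plus a constant load, which is the trade-off in question.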