On Sat, Feb 02, 2019 at 05:49:35PM -0500, Chuck Lever wrote:
> >> Byte-swapping causes a CPU pipeline bubble on some processors. When
> >> a decoder is comparing an on-the-wire value for equality, byte-
> >> swapping can be avoided by comparing it directly to a pre-byte-
> >> swapped constant value.
> >
> > Which ones?
>
> I assume you mean on which processors have I observed CPU cycle
> spikes around bswap instructions.

Yes.

> I've seen this behavior only
> on Intel processors of various families.

Interesting.  In general we should not need separate byte swap
instructions on x86, as MOVBE can do a load or store with the byte
swap included, and I thought the whole point of that was that both
could be handled in the same cycle.  In fact
https://www.agner.org/optimize/instruction_tables.pdf says that movbe
is generally a single cycle instruction.

> Would you prefer a different justification for this clean-up?

I don't really care about the cleanup, it is just that the explanation
goes against conventional wisdom, which is why I was a little
surprised.  And that is not just about the cycles: as Trond pointed
out, the Linux byte swapping macros applied to constants should
usually be optimized away at compile time anyway.
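
To make the two variants under discussion concrete, here is a minimal
userspace sketch.  It is only an illustration, not code from the patch:
EXAMPLE_MSG_TYPE and both function names are made up, and htonl()/ntohl()
stand in for the kernel's cpu_to_be32()/be32_to_cpu().  With optimization
enabled, a compiler will typically fold the swap of the constant at build
time, so the second variant should need no swap at run time.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical protocol constant, in host byte order. */
#define EXAMPLE_MSG_TYPE 1u

/* Variant A: swap the on-the-wire value, then compare in host order. */
static int match_swap_wire(const unsigned char *buf)
{
	uint32_t wire;

	memcpy(&wire, buf, sizeof(wire));	/* big-endian bytes, unaligned-safe */
	return ntohl(wire) == EXAMPLE_MSG_TYPE;
}

/* Variant B: compare the raw wire value against a pre-swapped constant. */
static int match_swap_const(const unsigned char *buf)
{
	uint32_t wire;

	memcpy(&wire, buf, sizeof(wire));
	return wire == htonl(EXAMPLE_MSG_TYPE);	/* swap applies to a constant */
}
```

Both functions return the same result for any input; whether variant A
costs anything measurable depends on whether the compiler emits a
separate bswap or a combined load such as movbe, which is the open
question above.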