On 09/04/2019 10.08, Rasmus Villemoes wrote:
> one could do
>
> u32 ror32(u32 x, unsigned s)
> {
> 	return (x >> (s&31)) | (x << ((32-s)&31));
> }
>
> to make the shifts always well-defined and also work as expected for s
> >= 32... if only gcc recognized that the masking is redundant, so that
> its "that's a ror" pattern detection could kick in. Unfortunately, it
> seems that the above generates
>
>    0:	89 f1                	mov    %esi,%ecx
>    2:	89 f8                	mov    %edi,%eax
>    4:	f7 d9                	neg    %ecx
>    6:	d3 e0                	shl    %cl,%eax
>    8:	89 f1                	mov    %esi,%ecx
>    a:	d3 ef                	shr    %cl,%edi
>    c:	09 f8                	or     %edi,%eax
>    e:	c3                   	retq
>
> while without the masking one gets
>
>   10:	89 f8                	mov    %edi,%eax
>   12:	89 f1                	mov    %esi,%ecx
>   14:	d3 c8                	ror    %cl,%eax
>   16:	c3                   	retq

Ah, but that's with an ancient gcc 7. With gcc 8, the above pattern is
recognized and generates good code, while eliminating UB. I was about to
file a gcc bug, but found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82498 .

Rasmus
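
For anyone who wants to convince themselves that the masked variant
really behaves as a rotate for every shift count, here is a minimal
userspace sketch. It only checks the C semantics, not what any given
compiler emits. The u32 typedef is a stand-in for the kernel type, and
ror32_ref(), ror32_selfcheck() and ror32_demo() are hypothetical helpers
that are not part of the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;	/* userspace stand-in for the kernel's u32 */

/* The masked rotate from the quoted mail: both shift counts stay in 0..31. */
static u32 ror32(u32 x, unsigned s)
{
	return (x >> (s & 31)) | (u32)(x << ((32 - s) & 31));
}

/* Hypothetical reference: rotate right one bit at a time, s mod 32 times. */
static u32 ror32_ref(u32 x, unsigned s)
{
	for (s &= 31; s; s--)
		x = (x >> 1) | ((x & 1) << 31);
	return x;
}

/* Returns the number of shift counts in 0..63 where the two disagree. */
static unsigned ror32_selfcheck(u32 x)
{
	unsigned s, bad = 0;

	for (s = 0; s < 64; s++)
		bad += ror32(x, s) != ror32_ref(x, s);
	return bad;
}

/* Hypothetical usage: spot checks, including the s == 0 and s >= 32 cases. */
static void ror32_demo(void)
{
	assert(ror32(0x80000001u, 1) == 0xC0000000u);
	assert(ror32(0x12345678u, 0) == 0x12345678u);
	assert(ror32(0x12345678u, 32) == 0x12345678u);
	assert(ror32_selfcheck(0x12345678u) == 0);
}
```

The s == 0 case is the interesting one: (32-s)&31 is then 0, so the
left shift is by zero rather than by 32, which is what removes the UB.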