On Fri, Feb 28, 2025 at 1:20 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Thu, Feb 27, 2025 at 03:47:03PM -0800, Bill Wendling wrote:
> > For both gcc and clang, crc32 builtins generate better code than the
> > inline asm. GCC improves, removing unneeded "mov" instructions. Clang
> > does the same and unrolls the loops. GCC has no changes on i386, but
> > Clang's code generation is vastly improved, due to Clang's "rm"
> > constraint issue.
> >
> > The number of cycles improved by ~0.1% for GCC and ~1% for Clang, which
> > is expected because of the "rm" issue. However, Clang's performance is
> > better than GCC's by ~1.5%, most likely due to loop unrolling.
>
> Also note that the patch
> https://lore.kernel.org/r/20250210210741.471725-1-ebiggers@xxxxxxxxxx/ (which is
> already enqueued in the crc tree for 6.15) changes "rm" to "r" when the compiler
> is clang, to improve clang's code generation.  The numbers you quote are against
> the original version, right?
>
Yeah, they were against top-of-tree.

-bw
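
For anyone following along, here is a rough userspace sketch of the two forms being compared above: a crc32 step written as inline asm with an "rm" source constraint, and the same step using the compiler builtin. This is not the code from the patch. The function names (crc32c_sw_u64, crc32c_asm_u64, crc32c_builtin_u64, crc32c_variants_agree) are made up for illustration, and the bitwise software version is only there as a reference to check the other two against. It assumes GCC or Clang; the hardware variants are only built on x86-64 and only run if the CPU reports SSE4.2.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reference only: bitwise CRC32C step (reflected Castagnoli polynomial
 * 0x82F63B78) over 64 bits, least significant bit first, with no
 * pre/post inversion. Hypothetical helper, not kernel code.
 */
static uint32_t crc32c_sw_u64(uint32_t crc, uint64_t data)
{
	for (int i = 0; i < 64; i++) {
		uint32_t bit = (crc ^ (uint32_t)(data >> i)) & 1u;

		crc = (crc >> 1) ^ (bit ? 0x82F63B78u : 0u);
	}
	return crc;
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/*
 * Inline-asm form. The "rm" constraint lets the compiler pick either a
 * register or a memory operand for the source.
 */
__attribute__((target("sse4.2")))
static uint32_t crc32c_asm_u64(uint32_t crc, uint64_t data)
{
	uint64_t c = crc;

	__asm__("crc32q %1, %0" : "+r" (c) : "rm" (data));
	return (uint32_t)c;
}

/* Builtin form: operand selection is left entirely to the compiler. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_builtin_u64(uint32_t crc, uint64_t data)
{
	return (uint32_t)__builtin_ia32_crc32di(crc, data);
}
#endif

/* Returns 1 if every variant available on this machine agrees. */
static int crc32c_variants_agree(void)
{
	static const uint64_t inputs[] = {
		0, 1, 0x0123456789abcdefULL, ~0ULL
	};
	uint32_t crc = ~0u;

	/* All-zero state and data must stay zero. */
	assert(crc32c_sw_u64(0, 0) == 0);

	for (unsigned int i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++) {
		uint32_t want = crc32c_sw_u64(crc, inputs[i]);

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
		if (__builtin_cpu_supports("sse4.2")) {
			if (crc32c_asm_u64(crc, inputs[i]) != want)
				return 0;
			if (crc32c_builtin_u64(crc, inputs[i]) != want)
				return 0;
		}
#endif
		crc = want;
	}
	return 1;
}
```

Calling crc32c_variants_agree() from a small main() and building at -O2 with each compiler, then comparing the disassembly of the asm and builtin functions, is one way to look at the operand handling discussed above. It says nothing about the cycle counts quoted in the thread.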