On Mon, May 15, 2023 at 01:38:51AM -0400, Kent Overstreet wrote:
> On Sun, May 14, 2023 at 11:43:25AM -0700, Eric Biggers wrote:
> > I think it would also help if the generated assembly had the handling of the
> > fields interleaved. To achieve that, it might be necessary to interleave the C
> > code.
>
> No, that has negligible effect on performance - as expected, for an out
> of order processor. < 1% improvement.
>
> It doesn't look like this approach is going to work here. Sadly.

I'd be glad to take a look at the code you actually tried. It would be helpful
if you actually provided it, instead of just this "I tried it, I'm giving up
now" sort of thing.

I was also hoping you'd take the time to split this out into a userspace
micro-benchmark program that we could quickly try different approaches on.

BTW, even if people are okay with dynamic code generation (which seems
unlikely?), you'll still need a C version for architectures that you haven't
implemented the dynamic code generation for.

- Eric