On Fri, Mar 18, 2022 at 06:28:37PM +0100, Peter Zijlstra wrote:
> > Related to this, I don't see anything in arch/x86/kernel/static_call.c that
> > limits this code to x86-64:
> >
> >                 if (func == &__static_call_return0) {
> >                         emulate = code;
> >                         code = &xor5rax;
> >                 }
> >
> >
> > On 32-bit, it will be patched as "dec ax; xor eax, eax" or something like
> > that.  Fortunately it doesn't corrupt any callee-save register but it is not
> > just a bit funky, it's also not a single instruction.
>
> Urggghh.. that's fairly yuck. So there's two options I suppose:
>
>	0x66, 0x66, 0x66, 0x31, 0xc0

Argh, that turns into: xorw %ax, %ax.

Let me see if there's another option.

> Which is a tripple prefix xor %eax, %eax, which, IIRC should still clear
> the whole 64bit on 64bit and *should* still not trigger the prefix
> decoding penalty some frontends have (which is >3 IIRC).
>
> Or we can emit:
>
>	0xb8, 0x00, 0x00, 0x00, 0x00
>
> which decodes to: mov $0x0,%eax, which is less efficient in some
> front-ends since it doesn't always get picked up in register rename
> stage.
>
>
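
For reference, a rough userspace sketch of the byte sequences in play, so
the decodes above are easier to compare side by side. The xor5rax bytes are
from memory of arch/x86/kernel/static_call.c and should be checked against
the tree; xor_prefixed, mov_zero, count_data16_prefixes() and
writes_only_16_bits() are made-up names for illustration only, and
CALL_INSN_SIZE is redefined locally. The last helper is a crude pattern
match on these particular candidates, not an instruction decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's 5-byte call site size. */
#define CALL_INSN_SIZE 5

/*
 * Current sequence (bytes quoted from memory, treat as an assumption).
 * x86-64: data16 data16 REX.W xor %eax,%eax, i.e. one instruction.
 * 32-bit: 0x48 is not a prefix but "dec", so this splits in two,
 * matching the "dec ax; xor eax, eax" decode reported above.
 */
static const uint8_t xor5rax[CALL_INSN_SIZE] = { 0x66, 0x66, 0x48, 0x31, 0xc0 };

/*
 * First option from the mail. 0x66 is the operand-size override, so
 * without REX.W the xor only operates on %ax: xorw %ax,%ax.
 */
static const uint8_t xor_prefixed[CALL_INSN_SIZE] = { 0x66, 0x66, 0x66, 0x31, 0xc0 };

/* Second option from the mail: mov $0x0,%eax (opcode 0xb8 + imm32). */
static const uint8_t mov_zero[CALL_INSN_SIZE] = { 0xb8, 0x00, 0x00, 0x00, 0x00 };

/* Count the leading 0x66 (operand-size override) bytes. */
static size_t count_data16_prefixes(const uint8_t *insn, size_t len)
{
	size_t n = 0;

	while (n < len && insn[n] == 0x66)
		n++;
	return n;
}

/*
 * Crude check, not a decoder: true when at least one 0x66 prefix is
 * followed directly by the 0x31 xor opcode, i.e. nothing (such as
 * REX.W) restores the wider operand size.
 */
static int writes_only_16_bits(const uint8_t *insn, size_t len)
{
	size_t n = count_data16_prefixes(insn, len);

	return n > 0 && n < len && insn[n] == 0x31;
}
```

With these definitions, writes_only_16_bits() is true only for
xor_prefixed, which is the problem with the first option; mov_zero has no
prefixes at all and is the same five bytes on 32-bit and 64-bit.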