On Mon, Feb 13, 2017 at 12:06:44PM -0800, hpa@xxxxxxxxx wrote:
> >Maybe:
> >
> >movsql %edi, %rax;
> >movq __per_cpu_offset(,%rax,8), %rax;
> >cmpb $0, %[offset](%rax);
> >setne %al;
> >
> >?
>
> We could kill the zero or sign extend by changing the calling
> interface to pass an unsigned long instead of an int. It is much more
> likely that a zero extend is free for the caller than a sign extend.

Right, Boris and I talked about that on IRC. I was wondering whether, if
the argument were a u32, we could assume the top 32 bits are 0 and then
use rdi without a prior movzx. That would save one more instruction.

Also, the PVOP_CALL_ARG#() macros have an (unsigned long) cast in them
that doesn't make sense. That cast makes the calling code do explicit
sign or zero extends into the full 64-bit register for no good reason.

If one removes that cast, things still compile, but I worry that
something somehow relies on this weird behaviour and will come apart.
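
To make the cast issue concrete, here is a minimal C sketch of what such a
cast does to a 32-bit argument. WIDEN_ARG() is a hypothetical stand-in, not
the real PVOP_CALL_ARG#() definition, and uint64_t stands in for unsigned
long as it is on x86-64. The point is only that the C conversion rules
force a sign extend for an int and a zero extend for a u32, so the caller
has to produce a full 64-bit value either way.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the (unsigned long) cast discussed above.
 * uint64_t is used in place of unsigned long so the sketch behaves the
 * same on any platform; on x86-64 Linux the two have the same width.
 */
#define WIDEN_ARG(x) ((uint64_t)(x))

/* Signed 32-bit argument: the conversion must sign extend. */
uint64_t widen_signed(int cpu)
{
	return WIDEN_ARG(cpu);
}

/* Unsigned 32-bit argument: the conversion must zero extend. */
uint64_t widen_unsigned(uint32_t cpu)
{
	return WIDEN_ARG(cpu);
}

/* Small self-check showing the two conversions differ on the same bits. */
void widen_self_check(void)
{
	/* All 32 bits set, as an int, becomes all 64 bits set. */
	assert(widen_signed(-1) == UINT64_MAX);
	/* The same 32-bit pattern, as a u32, keeps the top half clear. */
	assert(widen_unsigned(0xFFFFFFFFu) == 0xFFFFFFFFull);
	/* Non-negative values come out identical either way. */
	assert(widen_signed(3) == widen_unsigned(3));
}
```

On x86-64 a compiler would typically implement the signed case with a
sign-extending move and the unsigned case with a plain 32-bit move, since
writing a 32-bit register clears the upper half. Whether a callee may rely
on the upper bits of a u32 argument already being zero is exactly the open
question above; this sketch does not settle it.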