On 02/13/2017 03:06 PM, hpa@xxxxxxxxx wrote: > On February 13, 2017 2:53:43 AM PST, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >>> That way we'd end up with something like: >>> >>> asm(" >>> push %rdi; >>> movslq %edi, %rdi; >>> movq __per_cpu_offset(,%rdi,8), %rax; >>> cmpb $0, %[offset](%rax); >>> setne %al; >>> pop %rdi; >>> " : : [offset] "i" (((unsigned long)&steal_time) + offsetof(struct >> steal_time, preempted))); >>> And if we could get rid of the sign extend on edi we could avoid all >> the >>> push-pop nonsense, but I'm not sure I see how to do that (then again, >>> this asm foo isn't my strongest point). >> Maybe: >> >> movsql %edi, %rax; >> movq __per_cpu_offset(,%rax,8), %rax; >> cmpb $0, %[offset](%rax); >> setne %al; >> >> ? > We could kill the zero or sign extend by changing the calling interface to pass an unsigned long instead of an int. It is much more likely that a zero extend is free for the caller than a sign extend. I have thought of that too. However, the goal is to eliminate memory read/write from/to stack. Eliminating a register sign-extend instruction won't help much in term of performance. Cheers, Longman