* Linus Torvalds (torvalds@xxxxxxxxxxxxxxxxxxxx) wrote:
>
>
> On Mon, 15 Jun 2009, Ingo Molnar wrote:
> >
> > The gist of it is the replacement of iret with this open-coded
> > sequence:
> >
> > +#define NATIVE_INTERRUPT_RETURN_NMI_SAFE	pushq %rax;		\
> > +				movq %rsp, %rax;		\
> > +				movq 24+8(%rax), %rsp;		\
> > +				pushq 0+8(%rax);		\
> > +				pushq 16+8(%rax);		\
> > +				movq (%rax), %rax;		\
> > +				popfq;				\
> > +				ret
>
> That's an odd way of writing it.
>

There were a few reasons (maybe not all good) for writing it like this:

- Saving I$ (as it is placed close to hot entry.S code paths)
- Staying localized with the top of stack, saving D$ accesses.

But maybe benchmarks will prove my approach overkill, dunno. Also we
have to be aware that the CPU might behave more slowly in the presence
of unbalanced int/iret, call/ret. I think we should benchmark your
approach to make sure jmp will not produce such slowdown. But it might
well be faster, and it's definitely clearer.

Thanks,

Mathieu

> Don't we have a per-cpu segment here? I'd much rather just see it do
> something like this (_before_ restoring the regular registers)
>
> 	movq EIP(%esp),%rax
> 	movq ESP(%esp),%rdx
> 	movq %rax,gs:saved_esp
> 	movq %rdx,gs:saved_eip
>
> 	# restore regular regs
> 	RESTORE_ALL
>
> 	# skip eip/esp to get at eflags
> 	addl $16,%esp
> 	popfq
>
> 	# restore rsp/rip
> 	movq gs:saved_esp,%rsp
> 	jmpq *(gs:saved_eip)
>
> but I haven't thought deeply about it. Maybe there's something wrong with
> the above.
>
> > If it's faster, this becomes a legit (albeit complex)
> > micro-optimization in a _very_ hot codepath.
>
> I don't think it's all that hot. It's not like it's the return to user
> mode.
>
> 		Linus

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
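
For anyone following the offsets in the quoted NATIVE_INTERRUPT_RETURN_NMI_SAFE
macro, here is a minimal C model (not kernel code) that replays its eight
instructions on a toy stack. It assumes the usual x86-64 interrupt frame
layout (RIP, CS, RFLAGS, RSP, SS, going upward from the stack pointer).
struct cpu, the helper names and every sample value are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WORDS 64                 /* toy memory: 64 eight-byte slots */

struct cpu {                     /* hypothetical model, not a kernel type */
	uint64_t mem[WORDS];
	uint64_t rsp, rax, rflags, rip;
};

static uint64_t load(struct cpu *c, uint64_t addr)
{
	return c->mem[addr / 8];
}

static void store(struct cpu *c, uint64_t addr, uint64_t val)
{
	c->mem[addr / 8] = val;
}

static void push(struct cpu *c, uint64_t val)
{
	c->rsp -= 8;
	store(c, c->rsp, val);
}

static uint64_t pop(struct cpu *c)
{
	uint64_t val = load(c, c->rsp);

	c->rsp += 8;
	return val;
}

/* One C statement per instruction of the quoted macro. */
static void nmi_safe_return(struct cpu *c)
{
	push(c, c->rax);                    /* pushq %rax             */
	c->rax = c->rsp;                    /* movq %rsp, %rax        */
	c->rsp = load(c, c->rax + 24 + 8);  /* movq 24+8(%rax), %rsp  */
	push(c, load(c, c->rax + 0 + 8));   /* pushq 0+8(%rax)  (RIP) */
	push(c, load(c, c->rax + 16 + 8));  /* pushq 16+8(%rax) (RFLAGS) */
	c->rax = load(c, c->rax);           /* movq (%rax), %rax      */
	c->rflags = pop(c);                 /* popfq                  */
	c->rip = pop(c);                    /* ret                    */
}

/* Build a made-up interrupt frame at address 200; interrupted rsp is 400. */
static struct cpu make_interrupted_cpu(void)
{
	struct cpu c;

	memset(&c, 0, sizeof(c));
	c.rsp = 200;
	c.rax = 0x1234;            /* value the interrupted code expects back */
	store(&c, 200, 0x1000);    /* saved RIP    */
	store(&c, 208, 0x10);      /* saved CS     */
	store(&c, 216, 0x246);     /* saved RFLAGS */
	store(&c, 224, 400);       /* saved RSP    */
	store(&c, 232, 0x18);      /* saved SS     */
	return c;
}
```

Calling nmi_safe_return() on the result of make_interrupted_cpu() leaves
rip, rflags, rsp and rax holding the saved values. The model also shows
that the sequence writes two words just below the interrupted rsp and
never reloads CS or SS.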