On Mon, Feb 15, 2016 at 10:31:34AM -0600, Josh Poimboeuf wrote:
> On Fri, Feb 12, 2016 at 09:10:11PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 12, 2016 at 12:32:06PM -0600, Josh Poimboeuf wrote:
> > > What I actually see in the listing is:
> > >
> > >     decl __percpu_prefix:__preempt_count
> > >     je 1f:
> > >     ....
> > >     1:
> > >     call ___preempt_schedule
> > >
> > > So it puts the "call ___preempt_schedule" in the slow path.
> >
> > Ah yes indeed. Same difference though.
> >
> > > I also don't see how that would be related to the use of the asm
> > > statement in the __preempt_schedule() macro. Doesn't the use of
> > > unlikely() in preempt_enable() put the call in the slow path?
> >
> > Sadly no, unlikely() and asm_goto don't work well together. But the slow
> > path or not isn't the reason we do the asm call thing.
> >
> > > #define preempt_enable() \
> > > do { \
> > >     barrier(); \
> > >     if (unlikely(preempt_count_dec_and_test())) \
> > >         preempt_schedule(); \
> > > } while (0)
> > >
> > > Also, why is the thunk needed? Any reason why preempt_enable() can't be
> > > called directly from C?
> >
> > That would make the call-site save registers and increase the size of
> > every preempt_enable(). By using the thunk we can do callee saved
> > registers and avoid blowing up the call site.
>
> So is the goal to optimize for size?

General performance impact of preempt_enable().

> If I replace the calls to
> __preempt_schedule[_notrace]() with real C calls and remove the thunks,
> it only adds about 2k to vmlinux.

That's less than I had expected, but probably still worth it. And is
that added text purely in the slow path?

We really want to avoid putting any more register pressure on the
preempt_enable() call sites. The single memop and Jcc is about as fast
as we can get and we spend quite a bit of effort getting there.

> There are two ways to fix the warnings:
>
> 1. get rid of the thunks and call the C functions directly; or
>
> 2. add the stack pointer to the asm() statement output operand list to
>    ensure a stack frame gets created in the caller function before the
>    call. (Note this still allows the thunks to do callee saved registers.)
>
> I like #1 better, but maybe I'm still missing the point of the thunks.

Ingo, Linus?
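
For anyone following along, here is a rough userspace model of the
fast-path/slow-path split discussed above. It is only a sketch, not kernel
code: the model_* names are made up for illustration, a plain global stands
in for the per-CPU __preempt_count, and it ignores the fact that the real
kernel also checks whether a reschedule is actually pending. It assumes GCC
or Clang for __builtin_expect and the asm barrier.

```c
/* Userspace model only; not kernel code. Assumes GCC or Clang. */
#include <assert.h>

#define unlikely(x) __builtin_expect(!!(x), 0)              /* branch hint */
#define barrier()   __asm__ __volatile__("" ::: "memory")   /* compiler barrier */

/* Hypothetical stand-in for the per-CPU __preempt_count. */
static int model_preempt_count;
/* Counts how many times the slow path was taken. */
static int model_schedule_calls;

/* Slow path: in this model it only records that it ran. */
static void model_preempt_schedule(void)
{
	model_schedule_calls++;
}

/* Decrement the count and report whether it reached zero. */
static int model_preempt_count_dec_and_test(void)
{
	assert(model_preempt_count > 0);	/* catch unbalanced enables */
	return --model_preempt_count == 0;
}

#define model_preempt_disable() \
do { \
	model_preempt_count++; \
	barrier(); \
} while (0)

/* Same shape as the preempt_enable() macro quoted above. */
#define model_preempt_enable() \
do { \
	barrier(); \
	if (unlikely(model_preempt_count_dec_and_test())) \
		model_preempt_schedule(); \
} while (0)

/* Nest `depth` disable/enable pairs; return the number of slow-path calls. */
static int model_run_nested(int depth)
{
	int i;

	model_schedule_calls = 0;
	for (i = 0; i < depth; i++)
		model_preempt_disable();
	for (i = 0; i < depth; i++)
		model_preempt_enable();
	return model_schedule_calls;
}
```

In this model only the outermost enable reaches the slow path, so
model_run_nested(3) takes it once. Every inner enable is just the
decrement and a not-taken branch, which is the part the thread wants to
keep cheap at each call site.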