On Jul 12, 2015 12:06 PM, "Mathieu Desnoyers" <mathieu.desnoyers@xxxxxxxxxxxx> wrote: > > Expose a new system call allowing threads to register a userspace memory > area where to store the current CPU number. Scheduler migration sets the > TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space, > a notify-resume handler updates the current CPU value within that > user-space memory area. > > This getcpu cache is an alternative to the sched_getcpu() vdso which has > a few benefits: > - It is faster to do a memory read that to call a vDSO, > - This cache value can be read from within an inline assembly, which > makes it a useful building block for restartable sequences. > Let's wait and see what the final percpu atomic solution is. If it involves percpu segments, then this is unnecessary. Also, this will need to be rebased onto -tip, and that should wait until the big exit rewrite is farther along. > This approach is inspired by Paul Turner and Andrew Hunter's work > on percpu atomics, which lets the kernel handle restart of critical > sections: > Ref.: > * https://lkml.org/lkml/2015/6/24/665 > * https://lwn.net/Articles/650333/ > * http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf > > Benchmarking sched_getcpu() vs tls cache approach. Getting the > current CPU number: > > - With Linux vdso: 12.7 ns This is a bit unfair, because the glibc wrapper sucks and the __vdso_getcpu interface is overcomplicated. We can fix it with a better API. It won't make it *that* much faster, though. > > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c > index e504246..157cec0 100644 > --- a/arch/x86/kernel/signal.c > +++ b/arch/x86/kernel/signal.c > @@ -750,6 +750,8 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags) > if (thread_info_flags & _TIF_NOTIFY_RESUME) { > clear_thread_flag(TIF_NOTIFY_RESUME); > tracehook_notify_resume(regs); > + if (getcpu_cache_active(current)) > + getcpu_cache_handle_notify_resume(current); We need to disentangle this stuff. This is buried way too deeply here. Fortunately, do_notify_resume is going away. It's already unused on 64-bit kernels in -next. > +/* > + * This resume handler should always be executed between a migration > + * triggered by preemption and return to user-space. > + */ > +void getcpu_cache_handle_notify_resume(struct task_struct *t) > +{ > + int32_t __user *gcp = t->getcpu_cache; > + > + if (gcp == NULL) > + return; > + if (unlikely(t->flags & PF_EXITING)) > + return; > + /* > + * access_ok() of gcp_user has already been checked by > + * sys_getcpu_cache(). > + */ > + if (__put_user(raw_smp_processor_id(), gcp)) > + force_sig(SIGSEGV, current); We're preemptible here, although I think it's okay. But I'd at least clear the getcpu_cache state if __put_user fails, because otherwise it's not entirely obvious to me that we can't infinite loop. > +/* > + * sys_getcpu_cache - setup getcpu cache for caller thread > + */ > +SYSCALL_DEFINE2(getcpu_cache, int32_t __user *, gcp, int, flags) > +{ > + if (flags) > + return -EINVAL; > + if (gcp != NULL && !access_ok(VERIFY_WRITE, gcp, sizeof(int32_t))) > + return -EFAULT; > + current->getcpu_cache = gcp; > + /* Will update *gcp on resume */ > + if (gcp) > + set_thread_flag(TIF_NOTIFY_RESUME); > + return 0; > +} IMO this is impolite. If the pointer is bad, we should return -EFAULT rather than sending SIGSEGV. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html