----- On Jul 12, 2015, at 2:47 PM, Josh Triplett josh@xxxxxxxxxxxxxxxx wrote: > On Sun, Jul 12, 2015 at 02:06:26PM -0400, Mathieu Desnoyers wrote: >> Expose a new system call allowing threads to register a userspace memory >> area where to store the current CPU number. Scheduler migration sets the >> TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space, >> a notify-resume handler updates the current CPU value within that >> user-space memory area. >> >> This getcpu cache is an alternative to the sched_getcpu() vdso which has >> a few benefits: >> - It is faster to do a memory read that to call a vDSO, >> - This cache value can be read from within an inline assembly, which >> makes it a useful building block for restartable sequences. >> >> This approach is inspired by Paul Turner and Andrew Hunter's work >> on percpu atomics, which lets the kernel handle restart of critical >> sections: >> Ref.: >> * https://lkml.org/lkml/2015/6/24/665 >> * https://lwn.net/Articles/650333/ >> * >> http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf >> >> Benchmarking sched_getcpu() vs tls cache approach. Getting the >> current CPU number: >> >> - With Linux vdso: 12.7 ns >> - With TLS-cached cpu number: 0.3 ns > > Nice. One comment below about an interesting assumption that needs > confirmation. > >> --- /dev/null >> +++ b/kernel/getcpu-cache.c > [...] >> +void getcpu_cache_handle_notify_resume(struct task_struct *t) >> +{ >> + int32_t __user *gcp = t->getcpu_cache; >> + >> + if (gcp == NULL) >> + return; >> + if (unlikely(t->flags & PF_EXITING)) >> + return; >> + /* >> + * access_ok() of gcp_user has already been checked by >> + * sys_getcpu_cache(). >> + */ >> + if (__put_user(raw_smp_processor_id(), gcp)) >> + force_sig(SIGSEGV, current); >> +} >> + >> +/* >> + * If parent process has a getcpu_cache, the child inherits. Only >> + * applies when forking a process, not a thread. >> + */ >> +void getcpu_cache_fork(struct task_struct *t) >> +{ >> + t->getcpu_cache = current->getcpu_cache; >> +} >> + >> +void getcpu_cache_execve(struct task_struct *t) >> +{ >> + t->getcpu_cache = NULL; >> +} >> + >> +/* >> + * sys_getcpu_cache - setup getcpu cache for caller thread >> + */ >> +SYSCALL_DEFINE2(getcpu_cache, int32_t __user *, gcp, int, flags) >> +{ >> + if (flags) >> + return -EINVAL; >> + if (gcp != NULL && !access_ok(VERIFY_WRITE, gcp, sizeof(int32_t))) >> + return -EFAULT; >> + current->getcpu_cache = gcp; > > So, you store a userspace address, and intentionally only validate it > when initially set, not when used. You clear it on exec, though not on > fork. Could any cases other than exec could make this problematic? In > particular, what about unusual personality flags, such as > ADDR_LIMIT_32BIT or ADDR_LIMIT_3GB? That's an interesting point. Looking at those personalities, I don't think it should be an issue, but just the fact that you raise the question makes me think we should user put_user() rather than __put_user() in the notify_resume handler, just to be on the safe side. It should not be a frequent code path anyway. > >> + /* Will update *gcp on resume */ >> + if (gcp) > > Minor nit: you're using the pointer as a boolean here, but comparing it > to NULL elsewhere; you should be consistent. I'd suggest consistently > using gcp and !gcp, without the comparison to NULL. Good point, fixing, Thanks for the feedback! Mathieu > > - Josh Triplett -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html