On Fri, Aug 19, 2016 at 12:10:02PM -0700, Vineet Gupta wrote: > Al reported potential issue with ARC get_user() as it wasn't clearing > out destination pointer in case of fault due to bad address etc. Applied to my branch with other similar fixes. FWIW, there's another interesting question in the same general area - __get_user() callers tend to be on hot paths and they clump a _lot_. That's the original reasoning behind the __-variants; doing access_ok() once for the entire bunch rather then repeating it for every single call. However, access_ok() is not the only problem. Testing if an error has happened and conditional branching can also be sensitive; moreover, on recent x86 SMAP-related setup/teardown is costly as hell. The latter problem is solved by bracketing the entire series of accesses with a single setup/teardown pair (uaccess_begin()/uaccess_end()) and using unsafe_get_user()/unsafe_put_user() between those. The former has spawned a bunch of solutions: * pretty much arch-independent optimization - use err |= __get_user() instead of if (__get_user() != 0) goto fail. We drop branching, but we still get a plenty of crap. * x86-only get_user_ex(). Does *not* return anything, uses per-process flag to indicate errors, the entire sequence is bracketed by uaccess_try() and uaccess_catch(err), the latter dumps the flag into err. Pairing of brackets is enforced - expansion of uaccess_try() contains do { and uaccess_catch() - } while (0). Can't mix any userland access other than {get,put}_user_ex/unsafe_{get,put}_user into the series - AC flag will be buggered. In particular, any use of __copy_{to,from}_user() is a bug there. * somewhat similar, __get_user_err(v, p, err) on assorted architectures that are less register-starved than x86 is. Those are equivalent to if (__get_user(v, p)) err = -EFAULT; and translate into something along the lines of in .text: 1: reg = *p; 2: v = (__typeof(*p))reg; in .text.fixup: fixup(1): reg = 0; err = -EFAULT; goto 2; That gives a branch-free path in the normal case, with fixups done out-of-line. get_user_ex() is similar, except that it uses a field in current_thread_info() where those use a local variable. No bracketing needed - only access_ok() before going there. About a half of __get_user() callers are in arch/*, mostly in sigreturn(2) and friends. For those the use of arch-specific primitives is OK. However, there's another big pile in assorted compat code, and that obviously isn't OK with arch-specific stuff. I realize that asking such questions can very easily devolve into bikeshedding, with a bunch of "only x86 matters anyway" thrown in, but... it would be nice to come up with a syntax that could be used in arch-independent places. I toyed with things like uaccess_begin(); ... get_user_ex(v, p, err); ... put_user_ex(v, q, err); ... copy_from_user_ex(&s, r, err); ... copy_to_user_ex(&s, r, err); ... copy_in_user_ex(t, r, err); ... uaccess_check(err); ... err |= sanity_check(...); // returns 0 or -EFAULT ... uaccess_end(err); with x86 basically ignoring err in ..._ex() primitives and doing err |= current_thread_info()->flag; in uaccess_end()/uaccess_check(), while something that currently has __get_user_err() et.al. mapping get_user_ex() to it and making uaccess_{begin,end,check} no-ops, but that's pretty much a mechanical merge of those variants and none too pretty, at that. Suggestions? -- To unsubscribe from this list: send the line "unsubscribe linux-arch" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html