On Mon, Oct 07, 2019 at 11:26:35AM -0700, Linus Torvalds wrote:

> But on x86, if we move the STAC/CLAC out of the low-level copy
> routines and into the callers, we'll have a _lot_ of churn. I thought
> it would be mostly a "teach objtool" thing, but we have lots of
> different versions of it. Not just the 32-bit vs 64-bit, it's embedded
> in all the low-level asm implementations.
>
> And we don't want the regular "copy_to/from_user()" to then have to
> add the STAC/CLAC at the call-site. So then we'd want to un-inline
> copy_to_user() entirely.

For x86? Sure, why not... Note, BTW, that for short constant-sized
copies we *do* STAC/CLAC at the call site - see those
__uaccess_begin_nospec() in raw_copy_{from,to}_user() in the
switches...

> Which all sounds like a really good idea, don't get me wrong. I think
> we inline it way too aggressively now. But it's a _big_ job.
>
> So we probably _should_
>
>  - remove INLINE_COPY_TO/FROM_USER
>
>  - remove all the "small constant size special cases".
>
>  - make "raw_copy_to/from_user()" have the "unsafe" semantics and make
>    the out-of-line copy in lib/usercopy.c be the only real interface
>
>  - get rid of a _lot_ of oddities

Not that many, really. All we need is a temporary cross-architecture
__uaccess_begin_nospec(), so that __copy_{to,from}_user() could use
that, instead of having it done in (x86) raw_copy_..._...(). Other
callers of raw_copy_...() would simply wrap it into
user_access_begin()/user_access_end() pairs; this kludge is needed
only in __copy_from_user() and __copy_to_user(), and only until we
kill their callers outside of arch/*. Which we can do, in a cycle or
two. _ANY_ use of that temporary kludge outside of those two
functions will be grepped for and LARTed into the ground.

> I hope you prove me wrong. But I'll look at a smaller change to just
> make x86 use the current special copy loop (as
> "unsafe_copy_to_user()") and have everybody else do the trivial
> wrapper.
>
> Because we definitely should do that cleanup (it also fixes the whole
> "atomic copy in kernel space" issue that you pointed to that doesn't
> actually want STAC/CLAC at all), but it just looks fairly massive to
> me.

AFAICS, it splits nicely.

1) cross-architecture user_access_begin_dont_use(): on everything
except x86 it's empty, on x86 it's __uaccess_begin_nospec().

2) stac/clac lifted into x86 raw_copy_..._user(), out of
copy_user_generic_unrolled(), copy_user_generic_string() and
copy_user_enhanced_fast_string(). Similar lift out of
__copy_user_nocache().

3) that thing lifted as user_access_begin_dont_use() into
__copy_..._user...(), and as user_access_begin() into the other
generic callers, consuming access_ok() in the latter.
__copy_to_user_inatomic() can die at the same stage. (Rough shape
of (1) and (3) sketched below, after the list.)

4) possibly uninlining on x86 (and yes, killing the special size
handling off). We don't need to touch the inlining decisions for
any other architecture. At that point raw_copy_to_user() is
available for e.g. readdir.c to play with. And up to that point
only x86 sees any kind of code changes, so we don't have to worry
about other architectures.

5) kill the __copy_...() users outside of arch/*, along with quite
a bit of other weird shit in there. A cycle or two, with the final
goal being to kill the damn things off.

6) arch/* users get arch-by-arch treatment - mostly it's sigframe
handling. Won't affect the generic code and would be independent
for different architectures. Can happen in parallel with (5),
actually.

7) ... at that point user_access_begin_dont_use() gets removed and
thrown into the pile of mangled fingers of those who'd ignored all
warnings and used it somewhere else.
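To make (1) and (3) concrete, something along these lines - a sketch
of the shape, not a patch. Where exactly the definitions land, the
matching ..._end helper and some_generic_caller() are my guesses, not
anything from the plan above:

/* (1) generic fallback, in include/linux/uaccess.h, say: */
#ifndef user_access_begin_dont_use
#define user_access_begin_dont_use()	do { } while (0)
#define user_access_end_dont_use()	do { } while (0)
#endif

/* ... overridden in arch/x86/include/asm/uaccess.h - the SMAP +
 * spectre barrier dance: */
#define user_access_begin_dont_use()	__uaccess_begin_nospec()
#define user_access_end_dont_use()	__uaccess_end()

/* (3) the kludge goes into __copy_{to,from}_user() and nowhere else,
 * once stac/clac has been lifted out of x86 raw_copy_..._user(): */
static __always_inline unsigned long
__copy_from_user(void *to, const void __user *from, unsigned long n)
{
	unsigned long res;

	might_fault();
	user_access_begin_dont_use();
	res = raw_copy_from_user(to, from, n);
	user_access_end_dont_use();
	return res;
}

/* ... while the other callers of raw_copy_...() use the real pair,
 * consuming access_ok() in the process: */
static inline unsigned long
some_generic_caller(void *to, const void __user *from, unsigned long n)
{
	unsigned long res = n;

	if (user_access_begin(from, n)) {
		res = raw_copy_from_user(to, from, n);
		user_access_end();
	}
	return res;
}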
I don't believe that (5) would be doable entirely in this cycle, but
quite a few bits might be.

On a somewhat related note, do you see any problems with

void *copy_mount_options(const void __user *data)
{
	unsigned offs;
	char *copy;

	if (!data)
		return NULL;

	copy = kmalloc(PAGE_SIZE, GFP_KERNEL);
	if (!copy)
		return ERR_PTR(-ENOMEM);

	offs = (unsigned long)untagged_addr(data) & (PAGE_SIZE - 1);

	if (copy_from_user(copy, data, PAGE_SIZE - offs)) {
		kfree(copy);
		return ERR_PTR(-EFAULT);
	}
	if (offs) {
		if (copy_from_user(copy + PAGE_SIZE - offs,
				   data + PAGE_SIZE - offs, offs))
			memset(copy + PAGE_SIZE - offs, 0, offs);
	}
	return copy;
}

on the theory that any fault halfway through a page means a race
with munmap/mprotect/etc., so we can just pretend we'd lost the race
entirely? And to hell with exact_copy_from_user(), byte-by-byte
copying, etc.
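The consumer side wouldn't need to care either way - a simplified
sketch of what the ksys_mount() path in fs/namespace.c does with the
result (mount_with_options() is a made-up name here):

/* sketch of the consumer, modelled on the ksys_mount() path */
static long mount_with_options(char *dev, const char __user *dir,
			       char *type, unsigned long flags,
			       const void __user *data)
{
	void *options;
	long ret;

	options = copy_mount_options(data);
	if (IS_ERR(options))
		return PTR_ERR(options);

	/* options is NULL or a full page: the part of the user buffer
	 * that was mapped, with the rest zero-filled, so parsers never
	 * have to worry about running off the end of the page. */
	ret = do_mount(dev, dir, type, flags, options);

	kfree(options);
	return ret;
}

kfree(NULL) is a no-op, so the !data => NULL case needs no special
handling in the caller.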