On Fri, May 12, 2017 at 11:57 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > First, some stats: there's a thousand-odd callers of __get_user(). Out of > those, about 70% are in arch/, mostly in sigframe-related code. Sure. And they can be trivially converted, and none of them should care at all. > IOW, we have > * most of users in arch/* (heavily dominated by signal-related code, > both loads and stores). Those need careful massage; maybe unsafe-based > solution, maybe something else, but it's obviously per-architecture work > and these paths are sensitive. Why are they sensitive? Why not just do this: git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86 :^arch/x86/include/asm/uaccess.h | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g' which converts all the x86 uses in one go. Anybody who *relies* on not checking the address_limit is so broken as to be not even funny. And anything that is so performance-sensitive that anybody can even measure the effect of the above we can convert later. Sure, do it in pieces (eg each architecture separately, then "drivers", then "networking", then whatever, until all done), just so that *if* something actually depends on semantics (and that really shouldn't be the case), we have at least some information from a bisect. But I don't see the excuse for not just doing it. If nobody notices, it's an obvious improvement. And if somebody *does* notice, we know how to do it properly with unsafe_xyz_user(), because "__xyz_user()" most definitely isn't it. An example of something that *should* be converted is "csum_partial_copy_from_user()", but it really does need to use "user_access_begin()" and friends, because right now it's using stac/clac for each 16-bit word. That's *expensive*, and it's expensive whether you use __get_user() or get_user(). Adding x86 people just to see how they react to the above "patch". For me, in my fairly minimal personal config, the above cleanup patch shrinks the text size of the resulting kernel by 1.7kB. Yes, most of it is the out-of-line code, but still.. Interestingly, the signal handling code didn't change at all. It looks like only the 32-bit code uses __put_user() for the frame setup. The regular code uses put_user_try/put_user_catch, which is the x86-specific early try at the unsafe versions, but it would actually be improved by using "unsafe_put_user()" and my patch to make that use "asm goto". Linus PS. That "patch" depends on modern git - with older versions of git you need to do the path negation with ":!pattern", and then you need to quote it too since '!' is special for shell.