Oops. *Really* adding the x86 guys now. Blush. Linus On Sat, May 13, 2017 at 9:15 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > On Fri, May 12, 2017 at 11:57 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: >> >> First, some stats: there's a thousand-odd callers of __get_user(). Out of >> those, about 70% are in arch/, mostly in sigframe-related code. > > Sure. And they can be trivially converted, and none of them should care at all. > >> IOW, we have >> * most of users in arch/* (heavily dominated by signal-related code, >> both loads and stores). Those need careful massage; maybe unsafe-based >> solution, maybe something else, but it's obviously per-architecture work >> and these paths are sensitive. > > Why are they sensitive? > > Why not just do this: > > git grep -l '\<__\(\(get\)\|\(put\)\)_user(' -- arch/x86 > :^arch/x86/include/asm/uaccess.h > | xargs sed -i 's/__\(\(\(get\)\|\(put\)\)_user(\)/\1/g' > > which converts all the x86 uses in one go. > > Anybody who *relies* on not checking the address_limit is so broken as > to be not even funny. And anything that is so performance-sensitive > that anybody can even measure the effect of the above we can convert > later. > > Sure, do it in pieces (eg each architecture separately, then > "drivers", then "networking", then whatever, until all done), just so > that *if* something actually depends on semantics (and that really > shouldn't be the case), we have at least some information from a > bisect. > > But I don't see the excuse for not just doing it. If nobody notices, > it's an obvious improvement. And if somebody *does* notice, we know > how to do it properly with unsafe_xyz_user(), because "__xyz_user()" > most definitely isn't it. > > An example of something that *should* be converted is > "csum_partial_copy_from_user()", but it really does need to use > "user_access_begin()" and friends, because right now it's using > stac/clac for each 16-bit word. That's *expensive*, and it's expensive > whether you use __get_user() or get_user(). > > Adding x86 people just to see how they react to the above "patch". > > For me, in my fairly minimal personal config, the above cleanup patch > shrinks the text size of the resulting kernel by 1.7kB. Yes, most of > it is the out-of-line code, but still.. > > Interestingly, the signal handling code didn't change at all. It looks > like only the 32-bit code uses __put_user() for the frame setup. The > regular code uses put_user_try/put_user_catch, which is the > x86-specific early try at the unsafe versions, but it would actually > be improved by using "unsafe_put_user()" and my patch to make that use > "asm goto". > > Linus > > PS. That "patch" depends on modern git - with older versions of git > you need to do the path negation with ":!pattern", and then you need > to quote it too since '!' is special for shell.