On Fri, May 12, 2017 at 06:00:44PM -0700, Linus Torvalds wrote: > So I should have asked earlier, but I was feeling rather busy during > the early merge window.. > So I'm actually more interested to hear if you have any pending work > on cleaning up the __get/put_user() mess we have. Is that on your > radar at all? Yes, it is. > In particular, right now, both __get_user() and __put_user() are nasty > and broken interfaces. > > It *used* to be that they were designed to just generate a single > instruction. That's not what they currently do at all, due to the > whole SMAP/PAN on x86 and arm. > > For example, on x86, right now a single __put_user() call ends up > generating something like [snip] > But even that small thing is rather debatable from a performance > angle: the actual cost of __put_user() these days is not the function > call, but the clac/stac (on modern x86) and the PAN bit games (on > arm). > > So I'm actually inclined to just say "We should make > __get_user/__put_user() just be aliases for the normal > get_user/put_user() model". > which is more boilerplate, but ends up generating much better code. > And for "unsafe_put_user()" in particular, we actually could teach gcc > to use "asm goto" to really improve code generation. I have patches > for that if you want to look. > > I'm attaching an example patch for "filldir()" that does that modern > thing. It almost had the right form as-is anyway, and under some loads > (eg "find") filldir actually shows up in profiles too.\ > But the *first* thing I'd like to do would be to just get rid of > __get_user/__put_user as a thing. It really does generate nasty code, > and we might as well just make it do that function call. > > Because what we do now isn't right. If we care about performance, the > "__xyz_user" versions are wrong (use unsafe_xyz_user() instead). And > if you don't care about performance, you should use the regular > xyz_user() functions that do an ok job by just calling the right-sized > helper function instead of inlining the crud. > > Hmm? First, some stats: there's a thousand-odd callers of __get_user(). Out of those, about 70% are in arch/, mostly in sigframe-related code. Take a look at the output of git grep -n -w __get_user|grep -v '^arch' | sed -e 's/:.*//'|uniq -c|sort -nr 55 drivers/gpu/drm/drm_ioc32.c 43 drivers/staging/comedi/comedi_compat32.c 35 kernel/compat.c 27 net/compat.c 27 block/compat_ioctl.c 15 drivers/usb/core/devio.c 13 drivers/char/ipmi/ipmi_devintf.c 11 kernel/signal.c 10 fs/fcntl.c 8 ipc/compat.c 8 drivers/video/fbdev/sbuslib.c 7 net/socket.c 7 drivers/gpu/drm/mga/mga_ioc32.c 6 fs/select.c 6 drivers/tty/vt/vt_ioctl.c 5 drivers/tty/vt/selection.c 5 drivers/spi/spidev.c 5 drivers/pci/proc.c 4 kernel/ptrace.c 4 ipc/compat_mq.c 4 drivers/tty/vt/consolemap.c 3 sound/oss/sys_timer.c 3 drivers/media/usb/uvc/uvc_v4l2.c 3 drivers/macintosh/ans-lcd.c and then we have a smallish bunch of files with one or two callers. For __put_user() we have ~1800 callers total, ~1100 of them in arch/* and the rest goes like this: 73 drivers/gpu/drm/drm_ioc32.c 66 ipc/compat.c 58 drivers/gpu/drm/radeon/radeon_ioc32.c 55 block/compat_ioctl.c 52 kernel/compat.c 49 kernel/signal.c 43 drivers/staging/comedi/comedi_compat32.c 31 drivers/gpu/drm/r128/r128_ioc32.c 30 drivers/gpu/drm/mga/mga_ioc32.c 27 fs/signalfd.c 25 fs/readdir.c 24 fs/statfs.c 19 kernel/sys.c 17 net/compat.c 11 drivers/scsi/sg.c 10 fs/fcntl.c 8 sound/core/pcm_native.c 8 fs/binfmt_flat.c 7 fs/binfmt_elf_fdpic.c 6 net/socket.c 6 drivers/char/ipmi/ipmi_devintf.c 5 sound/oss/sys_timer.c 5 fs/binfmt_elf.c 5 drivers/video/fbdev/sbuslib.c 5 drivers/tty/vt/consolemap.c 5 drivers/spi/spidev.c 5 drivers/pci/proc.c 4 kernel/ptrace.c 4 ipc/compat_mq.c 3 fs/select.c 3 drivers/tty/vt/vt.c ... IOW, we have * most of users in arch/* (heavily dominated by signal-related code, both loads and stores). Those need careful massage; maybe unsafe-based solution, maybe something else, but it's obviously per-architecture work and these paths are sensitive. * few places outside of arch/* where these are abused; absolute majority are in ioctl compat code and they are _bad_. Bad on x86, bad on s390, etc. I'm not sure if switch to unsafe is the right solution for those, actually - depends on how we end up dealing with compat ioctls of that sort. Might be better to do bulk copy to/from userland, combined with conversion from 32bit to native kernel-side and passing pointers to kernel objects to functions doing actual work. Really depends upon the details. * some places in fairly common codepaths where __get_user() and __put_user() are being played with. And I certainly agree that they are not good. But IMO the first step is not to convert __get_user()/__put_user() to aliases of get_user()/put_user() - it's to get rid of their infestations outside of arch/*. They are concentrated in relatively few places, so we can deal with those individually and then just fucking ban their use outside of arch/*. That's easy enough to watch for - simple git grep will do, so anything creeping back will be immediately spotted. In -next, with subsequent explanation of the reasons Not To Do That Or Else(tm) to the people responsible. As for fs/readdir.c... 4 back-to-back stores like if (__put_user(ino, &dirent->d_ino)) goto efault; if (__put_user(0, &dirent->d_off)) goto efault; if (__put_user(reclen, &dirent->d_reclen)) goto efault; if (__put_user(d_type, &dirent->d_type)) goto efault; might be asking for copy_to_user(). Maybe that one is too small for that to be a win (on s390 it almost certainly would be a win, judging by what Martin and Heiko are saying); fs/statfs.c ones almost certainly are better off with 'convert, then copy_to_user()' approach. Another couple of places worth looking into are copy_siginfo_to_user() and signalfd_copyinfo(). Maybe unsafe is the best approach there as well, maybe not, but it's in the 'large enough for copy_to_user() be interesting, not too large for auto variable' range. I certainly want that shit gone from the common paths, no arguments here, but I want to avoid having to crawl through every architecture's sigframe handling first. Fortunately, we have that crap outside of arch/* concentrated in a few places and they are far enough from each other to be dealt with independently. There are interesting interplays with set_fs() stuff, BTW... There will be more than enough crap that _will_ require crawls through arch/* ;-/ I'm plotting that pile right now (basically, what pieces should go into never-rebased stem, so that per-architecture branches had what they need). get_user()/put_user() are part of it, it's just that I don't want to bring arch/*/kernel/signal*.c into the mix ;-/ Re asm-goto - which architectures would be able to use that? IOW, which gcc versions are stable enough in that area? I'd obviously like to see your patches...