On Thu, Nov 30, 2017 at 4:46 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Nov 30, 2017 at 05:18:33AM -0800, Christoph Hellwig wrote:
>> On Thu, Nov 30, 2017 at 02:07:19AM +0000, Al Viro wrote:
>> > Incidentally, grepping for sys_close() shows another piece of fun in
>> > net/netfilter/xt_bpf.c. Folks, ONCE DESCRIPTOR IS INSTALLED, THAT'S
>> > IT; THERE'S NO REMOVING IT ON FAILURE EXITS. sys_close() should
>> > never, ever be used that way. Sigh...
>>
>> Would be great do unexport the thing. Except that we also have
>> binfmt_misc (which looks legit) and autofs4, which on crack decided
>> that close() isn't a fun syscall, they'd much rather have an ioctl
>> that does exactly the same..
>
> Yes, since binfmt_misc one is guaranteed that its descriptor table is
> not shared - all callchains go through do_execveat_common(), where we'd
> use unshare_files(). autofs one is... not in good taste, but still
> safe; there the descriptor is preexisting and it's essentially a weird
> way of spelling close(2). References from syscall tables are, of course,
> OK. init/*.c uses are done pretty much from userland - they could have
> been straight syscalls, if not for the lack of klibc in kernel tree.
> Everything else, though...
>
> IMO we need a whack-a-mole list somewhere; "new callers of sys_close()
> anywhere outside of init/* and syscall tables" definitely should be
> on it...

#syz fix: fix kcm_clone()
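
For readers who want the rule quoted above ("once descriptor is installed, that's it") in concrete form, here is a rough userland toy model of the safe ordering: reserve a slot, do everything that can fail, and install last. This is not kernel code. Every name in it (toy_fdtable, toy_reserve_fd, toy_unreserve_fd, toy_install_fd, toy_open) is hypothetical and exists only for illustration. In the kernel the corresponding steps would be get_unused_fd_flags(), put_unused_fd() and fd_install().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TABLE_SIZE 8

/* Hypothetical stand-in for an open file; only a reference count. */
struct toy_file {
    int refcount;
};

/* Hypothetical descriptor table, imagined as shared between threads. */
struct toy_fdtable {
    bool reserved[TOY_TABLE_SIZE];           /* slot number is taken */
    struct toy_file *files[TOY_TABLE_SIZE];  /* non-NULL once visible */
};

/* Reserve the lowest free slot; nothing is visible to lookups yet. */
static int toy_reserve_fd(struct toy_fdtable *t)
{
    for (int fd = 0; fd < TOY_TABLE_SIZE; fd++) {
        if (!t->reserved[fd]) {
            t->reserved[fd] = true;
            return fd;
        }
    }
    return -1;
}

/* Give back a slot that was reserved but never installed. */
static void toy_unreserve_fd(struct toy_fdtable *t, int fd)
{
    assert(t->reserved[fd] && t->files[fd] == NULL);
    t->reserved[fd] = false;
}

/* Publish the file. After this, other users of the table can see it. */
static void toy_install_fd(struct toy_fdtable *t, int fd, struct toy_file *f)
{
    assert(t->reserved[fd] && t->files[fd] == NULL);
    f->refcount++;
    t->files[fd] = f;
}

/*
 * Reserve, do all fallible work, install last.
 * setup_fails simulates an error in the middle step.
 */
static int toy_open(struct toy_fdtable *t, struct toy_file *f, bool setup_fails)
{
    int fd = toy_reserve_fd(t);
    if (fd < 0)
        return -1;

    if (setup_fails) {
        /* Safe: the slot was never published, so nobody else saw it. */
        toy_unreserve_fd(t, fd);
        return -1;
    }

    toy_install_fd(t, fd, f);  /* point of no return, no failure exits after this */
    return fd;
}
```

The failure path only ever undoes a reservation. There is deliberately no "uninstall" helper: once the file is published in a shared table, another thread may already have looked it up, duplicated it or closed it, so closing "our" descriptor number on an error path can hit something else entirely.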