On Tue, Jul 10, 2018 at 6:15 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jul 10, 2018 at 06:05:49PM -0700, Linus Torvalds wrote:
>> Yeah, Andy is right that we should *not* make "write()" have side effects.
>>
>> Use it to queue things by all means, but not "do" things. Not unless
>> there's a very sane security model.
>>
>> On Tue, Jul 10, 2018 at 4:59 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> >
>> > I think the right solution is one of:
>> >
>> > (a) Pass a netlink-formatted blob to fsopen() and do the whole thing in one syscall. I don’t mean using netlink sockets -- just the nlattr format. Or you could use a different format. The part that matters is using just one syscall to do the whole thing.
>>
>> Please no. Not another nasty marshalling thing.
>>
>> > (b) Keep the current structure but use a new syscall instead of write().
>> >
>> > (c) Keep using write() but literally just buffer the data. Then have a new syscall to commit it. In other words, replace “x” with a syscall and call all the fs_context_operations helpers in that context instead of from write().
>>
>> But yeah, b-or-c sounds fine.
>
> Umm... How about "use credentials of opener for everything"?

If you want to audit every single filesystem for any code that uses
credentials for anything and add all the right kernel APIs and make
sure the filesystem uses them and somehow keep screwups from getting
added down the line, then okay I guess. As far as I know, we don't
even *have* an API for "open this device node using this struct cred *".

I kind of want to add a hack to set some poison bit in current->cred
in sys_write() and clear it on the way out. Sigh.

--Andy
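
For illustration only, option (c) from the quoted list can be modeled in user space roughly as below. This is a sketch under assumptions, not kernel code and not anyone's actual patch: struct ctx_model, ctx_write() and ctx_commit() are hypothetical names, and counting newline-terminated lines merely stands in for calling the fs_context_operations helpers. The point it shows is that the write path only queues bytes, and nothing is acted on until the separate commit step runs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CTX_BUF_MAX 256

/* Hypothetical user-space stand-in for a filesystem context. */
struct ctx_model {
    char   buf[CTX_BUF_MAX]; /* bytes queued by write(), not yet interpreted */
    size_t len;              /* number of queued bytes */
    int    applied;          /* options acted on so far (by commit only) */
};

/* write() path: copy the bytes and return. No parsing, no side effects. */
static long ctx_write(struct ctx_model *c, const char *data, size_t n)
{
    if (n > CTX_BUF_MAX - c->len)
        return -1;           /* would overflow; nothing is queued */
    memcpy(c->buf + c->len, data, n);
    c->len += n;
    return (long)n;
}

/* commit path: the only place where queued options are acted on.
 * Assumption: each newline-terminated line is one option. */
static int ctx_commit(struct ctx_model *c)
{
    int count = 0;

    assert(c->len <= CTX_BUF_MAX);
    for (size_t i = 0; i < c->len; i++)
        if (c->buf[i] == '\n')
            count++;
    c->applied += count;
    c->len = 0;              /* the queue is consumed */
    return count;
}
```

In this model any credential check would belong in ctx_commit(), since that is where the work happens; ctx_write() can be called by whoever holds the descriptor without doing anything on their behalf.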