On Sat, Nov 21, 2020 at 12:41:23AM +0100, Jann Horn wrote:
> On Thu, Nov 19, 2020 at 8:03 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > On Wed, Nov 18, 2020 at 9:18 PM Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> > > The upcoming RWF_ENCODED operation introduces some security concerns:
> > >
> > > 1. Compressed writes will pass arbitrary data to decompression
> > >    algorithms in the kernel.
> > > 2. Compressed reads can leak truncated/hole punched data.
> > >
> > > Therefore, we need to require privilege for RWF_ENCODED. It's not
> > > possible to do the permissions checks at the time of the read or write
> > > because, e.g., io_uring submits IO from a worker thread. So, add an open
> > > flag which requires CAP_SYS_ADMIN. It can also be set and cleared with
> > > fcntl(). The flag is not cleared in any way on fork or exec. It must be
> > > combined with O_CLOEXEC when opening to avoid accidental leaks (if
> > > needed, it may be set without O_CLOEXEC by using fnctl()).
> > >
> > > Note that the usual issue that unknown open flags are ignored doesn't
> > > really matter for O_ALLOW_ENCODED; if the kernel doesn't support
> > > O_ALLOW_ENCODED, then it doesn't support RWF_ENCODED, either.
> [...]
> > > diff --git a/fs/open.c b/fs/open.c
> > > index 9af548fb841b..f2863aaf78e7 100644
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -1040,6 +1040,13 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op)
> > >                 acc_mode = 0;
> > >         }
> > >
> > > +       /*
> > > +        * O_ALLOW_ENCODED must be combined with O_CLOEXEC to avoid accidentally
> > > +        * leaking encoded I/O privileges.
> > > +        */
> > > +       if ((how->flags & (O_ALLOW_ENCODED | O_CLOEXEC)) == O_ALLOW_ENCODED)
> > > +               return -EINVAL;
> > > +
> > >
> >
> > dup() can also result in accidental leak.
> > We could fail dup() of fd without O_CLOEXEC. Should we?
> >
> > If we should than what error code should it be? We could return EPERM,
> > but since we do allow to clear O_CLOEXEC or set O_ALLOW_ENCODED
> > after open, EPERM seems a tad harsh.
> > EINVAL seems inappropriate because the error has nothing to do with
> > input args of dup() and EBADF would also be confusing.
>
> This seems very arbitrary to me. Sure, leaking these file descriptors
> wouldn't be great, but there are plenty of other types of file
> descriptors that are probably more sensitive. (Writable file
> descriptors to databases, to important configuration files, to
> io_uring instances, and so on.) So I don't see why this specific
> feature should impose such special rules on it.

I agree with Jann. I'm okay with the O_CLOEXEC-on-open requirement if it
makes people more comfortable, but I don't think we should be bending
over backwards to block it anywhere else.
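
For anyone following along, here is a rough userspace sketch of the two
behaviours discussed above. It is not kernel code and not part of the
patch. DEMO_O_ALLOW_ENCODED is a made-up bit value used only for
illustration (the real O_ALLOW_ENCODED value comes from the patch series),
and check_encoded_flags() and demo_dup_clears_cloexec() are hypothetical
helper names. The first helper mirrors the flag test in the quoted hunk;
the second shows the dup() concern: the duplicate descriptor does not
inherit FD_CLOEXEC.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical bit, for illustration only. The real O_ALLOW_ENCODED value
 * is defined by the patch series and is not in released headers.
 */
#define DEMO_O_ALLOW_ENCODED 0x800000UL

/*
 * Mirrors the check in the quoted hunk: the encoded-I/O flag on its own
 * is rejected; combined with O_CLOEXEC it is accepted.
 */
int check_encoded_flags(unsigned long flags)
{
	if ((flags & (DEMO_O_ALLOW_ENCODED | O_CLOEXEC)) == DEMO_O_ALLOW_ENCODED)
		return -EINVAL;
	return 0;
}

/* Returns 1 if fd has FD_CLOEXEC set, 0 if not, -1 on fcntl() failure. */
int fd_is_cloexec(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);

	if (fdflags < 0)
		return -1;
	return (fdflags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Opens /dev/null with O_CLOEXEC, duplicates it with dup(), and checks
 * that the duplicate has close-on-exec cleared while the original keeps it.
 */
void demo_dup_clears_cloexec(void)
{
	int fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	int copy;

	assert(fd >= 0);
	assert(fd_is_cloexec(fd) == 1);

	copy = dup(fd);
	assert(copy >= 0);
	assert(fd_is_cloexec(copy) == 0);	/* dup() drops FD_CLOEXEC */
	assert(fd_is_cloexec(fd) == 1);		/* original is unaffected */

	close(copy);
	close(fd);
}
```

Calling demo_dup_clears_cloexec() from a small main() should run without
tripping an assertion on Linux. The point is only that an open-time
O_CLOEXEC requirement covers the open() path and nothing else. Whether
dup() should be restricted as well is the policy question in this thread.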