Re: [PATCH 24/32] vfs: syscall: Add fsopen() to prepare for superblock creation [ver #9]

David Howells <dhowells@xxxxxxxxxx> · Thu, 12 Jul 2018 15:54:04 +0100

Andy Lutomirski <luto@xxxxxxxxxx> wrote:

> > On Jul 11, 2018, at 12:22 AM, David Howells <dhowells@xxxxxxxxxx> wrote:
> >
> > Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >
> >>>   sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> >>>   write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
> >>
> >> Imagine some malicious program passes sfd as stdout to a setuid
> >> program. That program gets persuaded to write "s /etc/shadow".  What
> >> happens?  You’re okay as long as *every single fs* gets it right, but
> >> that’s asking a lot.
> >
> > Do note that you must already have CAP_SYS_ADMIN to be able to call
> > fsopen().
> 
> If you're not allowing it already, someone will want user namespace
> root to be able to use this very, very soon.

Yeah, I'm sure.  And I've been thinking about how to deal with it.

I think we *have* to open the source files/devices with the creds of whoever
called fsopen() or fspick() - that way you can't upgrade your privs by passing
your context fd to a suid program.  To enforce this, I think it's simplest for
fscontext_write() to call override_creds() right after taking the uapi_mutex
and then call revert_creds() right before dropping the mutex.
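
To make the shape of that concrete, here is a rough userspace model, not
kernel code.  override_creds() and revert_creds() mirror the kernel
signatures, but current_cred as a plain global, the locked flag standing in
for uapi_mutex, and the fsopen_model(), open_source() and
fscontext_write_model() helpers are all assumptions made up for illustration.

```c
/* Userspace model only; none of this is the real kernel implementation. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

struct cred { unsigned int fsuid; };

/* Stand-in for the kernel's per-task current_cred(). */
static const struct cred *current_cred;

/* Install new creds and hand back the ones that were in effect. */
static const struct cred *override_creds(const struct cred *new)
{
	const struct cred *old = current_cred;

	current_cred = new;
	return old;
}

/* Restore the creds returned by a matching override_creds(). */
static void revert_creds(const struct cred *old)
{
	current_cred = old;
}

struct fs_context {
	bool uapi_locked;		/* models uapi_mutex */
	const struct cred *cred;	/* creds of the fsopen()/fspick() caller */
};

/* Hypothetical fsopen(): record the opener's creds in the context. */
static void fsopen_model(struct fs_context *fc)
{
	fc->uapi_locked = false;
	fc->cred = current_cred;
}

/* Hypothetical permission check: only fsuid 0 may open /etc/shadow. */
static int open_source(const char *path)
{
	if (strcmp(path, "/etc/shadow") == 0 && current_cred->fsuid != 0)
		return -EACCES;
	return 0;
}

/* Hypothetical write handler: only understands "s <source>". */
static int fscontext_write_model(struct fs_context *fc, const char *cmd)
{
	const struct cred *old;
	int ret = -EINVAL;

	assert(!fc->uapi_locked);
	fc->uapi_locked = true;			/* take the mutex */
	old = override_creds(fc->cred);		/* act as the opener */

	if (cmd[0] == 's' && cmd[1] == ' ')
		ret = open_source(cmd + 2);

	assert(current_cred == fc->cred);	/* nothing swapped creds on us */
	revert_creds(old);			/* back to the writer's creds */
	fc->uapi_locked = false;		/* drop the mutex */
	return ret;
}
```

In this model, if a context is opened while current_cred has fsuid 1000 and
the write of "s /etc/shadow" is later issued while current_cred has fsuid 0,
the source open is still checked against fsuid 1000 and fails with -EACCES,
and the writer's own creds are back in place once the call returns.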

Another thing we might want to look at is allowing a supervisory process to
examine the context before the create/reconfigure action is permitted to
proceed.  It might also be possible to do this through the LSM.

David