Re: [RFC PATCH 4/9] User-space API for creating a supervisor-fd

On Mon, Mar 10, 2025 at 12:41:28AM +0000, Tingmao Wang wrote:
> On 3/5/25 16:09, Mickaël Salaün wrote:
> > On Tue, Mar 04, 2025 at 01:13:00AM +0000, Tingmao Wang wrote:
> > > We allow the user to pass in an additional flag to landlock_create_ruleset
> > > which will make the ruleset operate in "supervise" mode, with a supervisor
> > > attached. We create additional space in the landlock_ruleset_attr
> > > structure to pass the newly created supervisor fd back to user-space.
> > > 
> > > The intention, while not implemented yet, is that the user-space will read
> > > events from this fd and write responses back to it.
> > > 
> > > Note: need to investigate if fd clone on fork() is handled correctly, but
> > > should be fine if it shares the struct file. We might also want to let the
> > > user customize the flags on this fd, so that they can request no
> > > O_CLOEXEC.
> > > 
> > > NOTE: despite this patch having a new uapi, I'm still very open to e.g.
> > > re-using fanotify stuff instead (if that makes sense in the end). This is
> > > just a PoC.
> > 
> > The main security risk of this feature is for this FD to leak and be
> > used by a sandboxed process to bypass all its restrictions.  This should
> > be highlighted in the UAPI documentation.
> > 
> > > 
> > > Signed-off-by: Tingmao Wang <m@xxxxxxxxxx>
> > > ---
> > >   include/uapi/linux/landlock.h |  10 ++++
> > >   security/landlock/syscalls.c  | 102 +++++++++++++++++++++++++++++-----
> > >   2 files changed, 98 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> > > index e1d2c27533b4..7bc1eb4859fb 100644
> > > --- a/include/uapi/linux/landlock.h
> > > +++ b/include/uapi/linux/landlock.h
> > > @@ -50,6 +50,15 @@ struct landlock_ruleset_attr {
> > >   	 * resources (e.g. IPCs).
> > >   	 */
> > >   	__u64 scoped;
> > > +	/**
> > > +	 * @supervisor_fd: Placeholder to store the supervisor file
> > > +	 * descriptor when %LANDLOCK_CREATE_RULESET_SUPERVISE is set.
> > > +	 */
> > > +	__s32 supervisor_fd;
> > 
> > This interface would require the ruleset_attr becoming updatable by the
> > kernel, which might be OK in theory but requires current syscall wrapper
> > signature update, see sandboxer.c change.  It also creates a FD which
> > might not be useful (e.g. if an error occurs before the actual
> > enforcement).
> > 
> > I see a few alternatives.  We could just use/extend the ruleset FD
> > instead of creating a new one, but because leaking current rulesets is
> > not currently a security risk, we should be careful to not change that.
> > 
> > Another approach, similar to seccomp unotify, is to get a
> > "[landlock-domain]" FD returned by the landlock_restrict_self(2) when a
> > new LANDLOCK_RESTRICT_SELF_DOMAIN_FD flag is set.  This FD would be a
> > reference to the newly created domain, which is more specific than the
> > ruleset used to create this domain (and that can be used to create
> > other domains).  This domain FD could be used for introspection (i.e.
> > to get read-only properties such as domain ID), but being able to
> > directly supervise the referenced domain only with this FD would be a
> > risk that we should limit.
> > 
> > What we can do is to implement an IOCTL command for such domain FD that
> > would return a supervisor FD (if the LANDLOCK_RESTRICT_SELF_SUPERVISED
> > flag was also set).  The key point is to check (one time) that the
> > process calling this IOCTL is not restricted by the related domain (see
> > the scope helpers).
> 
> Is LANDLOCK_RESTRICT_SELF_DOMAIN_FD part of your (upcoming?) introspection
> patch? (thinking about when will someone pass that only and not
> LANDLOCK_RESTRICT_SELF_SUPERVISED, or vice versa)

I don't plan to work on such a LANDLOCK_RESTRICT_SELF_DOMAIN_FD flag for
now, but the introspection feature(s) would help for this supervisor
feature.

> 
> By the way, is it alright to conceptually relate the supervisor to a domain?
> It really would be a layer inside a domain - the domain could have earlier
> or later layers which can deny access without supervision, or the supervisor
> for earlier layers can deny access first. Therefore having supervisor fd
> coming out of the ruleset felt sensible to me at first.

Good question.  I've been using the name "domain" to refer to the set of
restrictions enforced on a set of processes, but these restrictions are
composed of inherited ones plus the latest layer.  In this case, a
domain FD should refer to all the restrictions, but the supervisor FD
should indeed only refer to the latest layer of a domain (created by
landlock_restrict_self).
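
As a rough illustration only (a user-space toy model, not kernel code and
not part of the patch series), the layering described above can be sketched
in C.  Every name below is hypothetical.  The sketch also assumes that a
supervisor is only consulted when its own layer would otherwise deny the
request, and that it cannot override a denial from any other layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical supervisor decision callback: returns true to allow. */
typedef bool (*toy_supervisor_fn)(uint64_t access);

/* One toy layer: what a single landlock_restrict_self() call would add. */
struct toy_layer {
	uint64_t allowed;             /* access rights allowed statically */
	toy_supervisor_fn supervisor; /* NULL if this layer is unsupervised */
};

/* A toy domain: the inherited layers plus the latest one. */
struct toy_domain {
	const struct toy_layer *layers;
	size_t num_layers;
};

/* Example supervisor that allows every request it is asked about. */
static bool toy_allow_all(uint64_t access)
{
	(void)access;
	return true;
}

/* A request passes only if every layer lets it through. */
static bool toy_domain_check(const struct toy_domain *dom, uint64_t access)
{
	assert(dom != NULL);
	for (size_t i = 0; i < dom->num_layers; i++) {
		const struct toy_layer *layer = &dom->layers[i];

		if ((layer->allowed & access) == access)
			continue; /* statically allowed by this layer */
		/* Denied here: only this layer's own supervisor may override. */
		if (layer->supervisor == NULL || !layer->supervisor(access))
			return false;
	}
	return true;
}
```

In this model a supervisor attached to the latest layer cannot rescue a
request that an earlier, unsupervised layer already denies, which is why
the supervisor FD is tied to one layer rather than to the whole domain.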

> 
> Also, isn't "check that process calling this IOCTL is not restricted by the
> related domain" and the fact that the IOCTL is on the domain fd, which is a
> return value of landlock_restrict_self, kind of contradictory?  I mean it is
> a sensible check, but that kind of highlights that this interface is
> slightly awkward - basically all callers are forced to have a setup where
> the child sends the domain fd back to the parent.

I agree that it's confusing.  I'd like to avoid letting the ruleset gain
any control over domains after they are created.

Another approach would be to create a supervisor FD with the
landlock_create_ruleset() syscall, and pass this FD to the ruleset,
potentially with landlock_add_rule() calls to only request this
supervisor when matching specific rules (that could potentially be
catch-all rules)?

Overall, my main concern about this patch series is that the supervisor
could get a lot of requests, which would make the sandbox unusable
because it would always be blocked by some thread/process.  This latest
approach and the ability to update the domain somehow could make it
workable.

> 
> > 
> > Relying on IOCTL commands (for all these FD types) instead of read/write
> > operations should also limit the risk of these FDs being misused through
> > a confused deputy attack (because such IOCTL command would convey an
> > explicit intent):
> > https://docs.kernel.org/security/credentials.html#open-file-credentials
> > https://lore.kernel.org/all/CAG48ez0HW-nScxn4G5p8UHtYy=T435ZkF3Tb1ARTyyijt_cNEg@xxxxxxxxxxxxxx/
> > We should get inspiration from seccomp unotify for this too:
> > https://lore.kernel.org/all/20181209182414.30862-1-tycho@xxxxxxxx/
> 
> I think in the seccomp unotify case the problem arises from what the setuid
> binary thinks is just normal data getting interpreted by the kernel as a fd,
> and thus having different effect if the attacker writes it vs. if the suid
> app writes it.  In our case I *think* we should be alright, but maybe we
> should go with ioctl anyway...

I don't see why Jann's attack scenario could work for this Landlock
supervisor too.  The main point is that the read/write interfaces are
used by a lot of different FDs, and we may not need them.

> However, how does using netlink messages (a
> suggestion from a different thread) affect this (if we do end up using it)?
> Would we have to do netlink msgs via IOCTL?

Because all requests should be synchronous, one IOCTL could be used to
both acknowledge a previous event (or just start) and read the next one.

I was thinking about an IOCTL with these arguments:
1. supervisor FD
2. (extensible) IOCTL command (see PIDFD_GET_INFO for instance)
3. pointer to a fixed-size control structure

The fixed-size control structure could contain:
- handled access rights, used to only get events related to specific
  access rights.
- flags, to specify which kind of FD we would like to get (e.g. only
  directory FD, pidfd...)
- fd[6]: an array of received file descriptors.
- pointer to a variable-size data buffer that would contain all the
  records (e.g. source dir FD, source file name, destination dir FD,
  destination file name) for one event, potentially formatted with NLA.
- the size of this buffer

I'm not sure about the content of this buffer and the NLA format, and
the related API might not be usable without netlink sockets though.
Taking inspiration from the fanotify message format is another option.
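
For illustration only, the fixed-size control structure listed above could
be laid out as follows in C.  This is a sketch under stated assumptions:
none of these names exist in the Landlock UAPI, the field order and widths
are guesses chosen to avoid implicit padding, and the record format inside
the data buffer is left open, as it is in the discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: number of FD slots, taken from "fd[6]" in the list above. */
#define LL_SUPERVISE_MAX_FDS 6

/* Hypothetical fixed-size control structure mirroring the list above. */
struct ll_supervise_ctl {
	uint64_t handled_access; /* only report events for these rights */
	uint32_t flags;          /* which kinds of FD the caller wants */
	uint32_t data_size;      /* size in bytes of the buffer at data_ptr */
	int32_t fd[LL_SUPERVISE_MAX_FDS]; /* FDs received with the event */
	uint64_t data_ptr;       /* user pointer to the variable-size records */
};

/* With this ordering the struct has no implicit padding on common ABIs. */
_Static_assert(sizeof(struct ll_supervise_ctl) == 48,
	       "unexpected padding in struct ll_supervise_ctl");

/* Prepare a request: zero everything, mark all FD slots as empty. */
static void ll_supervise_ctl_init(struct ll_supervise_ctl *ctl,
				  uint64_t handled_access, uint32_t flags,
				  void *buf, uint32_t buf_size)
{
	assert(ctl != NULL);
	memset(ctl, 0, sizeof(*ctl));
	ctl->handled_access = handled_access;
	ctl->flags = flags;
	for (int i = 0; i < LL_SUPERVISE_MAX_FDS; i++)
		ctl->fd[i] = -1;
	ctl->data_ptr = (uint64_t)(uintptr_t)buf;
	ctl->data_size = buf_size;
}
```

A supervisor loop would then pass a pointer to this structure as the third
ioctl(2) argument on the supervisor FD, with one (hypothetical) command
both acknowledging the previous event and fetching the next one.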

> 
> 
> > > +	/**
> > > +	 * @pad: Unused, must be zero.
> > > +	 */
> > > +	__u32 pad;
> > 
> > In this case we should pack the struct instead.
> > 
> > >   };
> > >   /*
> > > @@ -60,6 +69,7 @@ struct landlock_ruleset_attr {
> > >    */
> > >   /* clang-format off */
> > >   #define LANDLOCK_CREATE_RULESET_VERSION			(1U << 0)
> > > +#define LANDLOCK_CREATE_RULESET_SUPERVISE		(1U << 1)
> > >   /* clang-format on */
> > >   /**
> > 
> > [...]
> 
> 



