Re: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained access control

Axel Rasmussen <axelrasmussen@xxxxxxxxxx> · Mon, 1 Aug 2022 10:13:12 -0700

I finished up some other work and got around to writing a v5 today,
but I ran into a problem with /proc/[pid]/userfaultfd.

Files in /proc/[pid]/* are owned by the user/group which started the
process, and they don't support being chmod'ed.
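
To illustrate that behavior, here is a minimal Python sketch. The helper
name describe_proc_entry and the choice of /proc/self/status as the
example file are mine, purely for illustration. It reports the owner of
a /proc/[pid] entry and the errno from a chmod attempt, which I'd expect
procfs to refuse:

```python
import os
import stat


def describe_proc_entry(path="/proc/self/status"):
    """Return (uid, gid, chmod_errno) for a /proc/[pid] entry.

    chmod_errno is the errno raised by a chmod attempt, or None if the
    chmod unexpectedly succeeded.
    """
    st = os.stat(path)
    try:
        # Re-apply the existing permission bits, so nothing changes even
        # if the call were to succeed.
        os.chmod(path, stat.S_IMODE(st.st_mode))
        chmod_errno = None
    except OSError as exc:
        chmod_errno = exc.errno
    return st.st_uid, st.st_gid, chmod_errno


if __name__ == "__main__":
    print(describe_proc_entry())
```

On a typical system the uid/gid should match the user that started the
process, and chmod_errno should be non-None.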

For the userfaultfd device, I think we want the following semantics:
- For UFFDs created via the device, we want to always allow handling
kernel mode faults
- For security, the device should be owned by root:root by default, so
unprivileged users don't have default access to handle kernel faults
- But, the system administrator should be able to chown/chmod it, to
grant access to handling kernel faults for this process more widely.
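
As a rough sketch of how userspace might use such a device (not
necessarily what this series ends up with): the snippet below assumes
an ioctl named USERFAULTFD_IOC_NEW encoded as _IO(0xAA, 0x00), as in
the mainline <linux/userfaultfd.h> header, on an architecture where
_IOC_NONE is 0 (e.g. x86-64 or arm64). The helper name
new_uffd_from_device is hypothetical.

```python
import fcntl
import os

# Assumption: _IO(0xAA, 0x00) on architectures where _IOC_NONE == 0.
USERFAULTFD_IOC_NEW = (0xAA << 8) | 0x00


def new_uffd_from_device(dev_path="/dev/userfaultfd"):
    """Try to create a userfaultfd via the device node.

    Returns the new file descriptor, or None if the device is missing,
    the caller lacks permission on it, or the ioctl is rejected.
    """
    try:
        # Ordinary file permissions on the device node gate access here.
        dev_fd = os.open(dev_path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None
    try:
        # The ioctl's return value is the new userfaultfd.
        return fcntl.ioctl(dev_fd, USERFAULTFD_IOC_NEW,
                           os.O_CLOEXEC | os.O_NONBLOCK)
    except OSError:
        return None
    finally:
        os.close(dev_fd)


if __name__ == "__main__":
    uffd = new_uffd_from_device()
    print("no access to device" if uffd is None else "got uffd %d" % uffd)
    if uffd is not None:
        os.close(uffd)
```

With the device owned by root:root, an unprivileged caller would get
None here until the administrator chowns/chmods the node.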

/proc/[pid]/userfaultfd could be made to work like that, but I think it
would involve at least:

- Special casing userfaultfd in proc_pid_make_inode
- Updating setattr/getattr for /proc/[pid] to meaningfully store and
then retrieve uid/gid different from the task's, again probably
special cased for userfaultfd since we don't want this behavior for
other files

It seems to me such a change might raise eyebrows among procfs folks.
Before I spend the time to write this up, does this seem like
something that would obviously be nack'ed?

On Wed, Jul 20, 2022 at 4:21 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> On Jul 20, 2022, at 4:04 PM, Axel Rasmussen <axelrasmussen@xxxxxxxxxx> wrote:
>
> > On Wed, Jul 20, 2022 at 3:16 PM Schaufler, Casey
> > <casey.schaufler@xxxxxxxxx> wrote:
> >>> -----Original Message-----
> >>> From: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> >>> Sent: Tuesday, July 19, 2022 12:56 PM
> >>> To: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>; Andrew Morton
> >>> <akpm@xxxxxxxxxxxxxxxxxxxx>; Dave Hansen
> >>> <dave.hansen@xxxxxxxxxxxxxxx>; Dmitry V . Levin <ldv@xxxxxxxxxxxx>; Gleb
> >>> Fotengauer-Malinovskiy <glebfm@xxxxxxxxxxxx>; Hugh Dickins
> >>> <hughd@xxxxxxxxxx>; Jan Kara <jack@xxxxxxx>; Jonathan Corbet
> >>> <corbet@xxxxxxx>; Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>; Mike
> >>> Kravetz <mike.kravetz@xxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxx>;
> >>> Amit, Nadav <namit@xxxxxxxxxx>; Peter Xu <peterx@xxxxxxxxxx>;
> >>> Shuah Khan <shuah@xxxxxxxxxx>; Suren Baghdasaryan
> >>> <surenb@xxxxxxxxxx>; Vlastimil Babka <vbabka@xxxxxxx>; zhangyi
> >>> <yi.zhang@xxxxxxxxxx>
> >>> Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>; linux-
> >>> doc@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; linux-
> >>> kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
> >>> kselftest@xxxxxxxxxxxxxxx
> >>> Subject: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained
> >>> access control
> >>
> >> I assume that leaving the LSM mailing list off of the CC is purely
> >> accidental. Please, please include us in the next round.
> >
> > Honestly it just hadn't occurred to me, but I'm more than happy to CC
> > it on future revisions.
> >
> >>> This series is based on torvalds/master.
> >>>
> >>> The series is split up like so:
> >>> - Patch 1 is a simple fixup which we should take in any case (even by itself).
> >>> - Patches 2-5 add the feature, configurable selftest support, and docs.
> >>>
> >>> Why not ...?
> >>> ============
> >>>
> >>> - Why not /proc/[pid]/userfaultfd? The proposed use case for this is for one
> >>> process to open a userfaultfd which can intercept another process' page
> >>> faults. This seems to me like exactly what CAP_SYS_PTRACE is for, though,
> >>> so I think this use case can simply use a syscall without the powers
> >>> CAP_SYS_PTRACE grants being "too much".
> >>>
> >>> - Why not use a syscall? Access to syscalls is generally controlled by
> >>> capabilities. We don't have a capability which is used for userfaultfd access
> >>> without also granting more / other permissions as well, and adding a new
> >>> capability was rejected [1].
> >>>
> >>> - It's possible a LSM could be used to control access instead. I suspect
> >>> adding a brand new one just for this would be rejected,
> >>
> >> You won't know if you don't ask.
> >
> > Fair enough - I wonder if MM folks (Andrew, Peter, Nadav especially)
> > would find that approach more palatable than /proc/[pid]/userfaultfd?
> > Would it make sense from our perspective to propose a userfaultfd- or
> > MM-specific LSM for controlling access to certain features?
> >
> > I remember +Andrea saying Red Hat was also interested in some kind of
> > access control mechanism like this. Would one or the other approach be
> > more convenient for you?
>
> To reiterate my position - I think that /proc/[pid]/userfaultfd is very
> natural and can be easily extended to support cross-process access of
> userfaultfd. The necessary access controls are simple in any case. For
> cross-process access, they are similar to those that are used for other
> /proc/[pid]/X such as pagemap.
>
> I have little experience with LSM and I do not know how real deployments use
> them. If they are easier to deploy and people prefer them over some
> pseudo-file, I cannot argue against them.