I finished up some other work and got around to writing a v5 today,
but I ran into a problem with /proc/[pid]/userfaultfd.

Files in /proc/[pid]/* are owned by the user/group which started the
process, and they don't support being chmod'ed.

For the userfaultfd device, I think we want the following semantics:
- For UFFDs created via the device, we want to always allow handling
  kernel mode faults
- For security, the device should be owned by root:root by default, so
  unprivileged users don't have default access to handle kernel faults
- But, the system administrator should be able to chown/chmod it, to
  grant access to handling kernel faults for this process more widely.

It could be made to work like that but I think it would involve at least:
- Special casing userfaultfd in proc_pid_make_inode
- Updating setattr/getattr for /proc/[pid] to meaningfully store and
  then retrieve uid/gid different from the task's, again probably
  special cased for userfaultfd since we don't want this behavior for
  other files

It seems to me such a change might raise eyebrows among procfs folks.
Before I spend the time to write this up, does this seem like
something that would obviously be nack'ed?

On Wed, Jul 20, 2022 at 4:21 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> On Jul 20, 2022, at 4:04 PM, Axel Rasmussen <axelrasmussen@xxxxxxxxxx> wrote:
>
> > On Wed, Jul 20, 2022 at 3:16 PM Schaufler, Casey
> > <casey.schaufler@xxxxxxxxx> wrote:
> >>> -----Original Message-----
> >>> From: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>
> >>> Sent: Tuesday, July 19, 2022 12:56 PM
> >>> To: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>; Andrew Morton
> >>> <akpm@xxxxxxxxxxxxxxxxxxxx>; Dave Hansen
> >>> <dave.hansen@xxxxxxxxxxxxxxx>; Dmitry V .
> >>> Levin <ldv@xxxxxxxxxxxx>; Gleb
> >>> Fotengauer-Malinovskiy <glebfm@xxxxxxxxxxxx>; Hugh Dickins
> >>> <hughd@xxxxxxxxxx>; Jan Kara <jack@xxxxxxx>; Jonathan Corbet
> >>> <corbet@xxxxxxx>; Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>; Mike
> >>> Kravetz <mike.kravetz@xxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxx>;
> >>> Amit, Nadav <namit@xxxxxxxxxx>; Peter Xu <peterx@xxxxxxxxxx>;
> >>> Shuah Khan <shuah@xxxxxxxxxx>; Suren Baghdasaryan
> >>> <surenb@xxxxxxxxxx>; Vlastimil Babka <vbabka@xxxxxxx>; zhangyi
> >>> <yi.zhang@xxxxxxxxxx>
> >>> Cc: Axel Rasmussen <axelrasmussen@xxxxxxxxxx>;
> >>> linux-doc@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx;
> >>> linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> >>> linux-kselftest@xxxxxxxxxxxxxxx
> >>> Subject: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained
> >>> access control
> >>
> >> I assume that leaving the LSM mailing list off of the CC is purely
> >> accidental. Please, please include us in the next round.
> >
> > Honestly it just hadn't occurred to me, but I'm more than happy to CC
> > it on future revisions.
> >
> >>> This series is based on torvalds/master.
> >>>
> >>> The series is split up like so:
> >>> - Patch 1 is a simple fixup which we should take in any case (even by itself).
> >>> - Patches 2-6 add the feature, configurable selftest support, and docs.
> >>>
> >>> Why not ...?
> >>> ============
> >>>
> >>> - Why not /proc/[pid]/userfaultfd? The proposed use case for this is for one
> >>>   process to open a userfaultfd which can intercept another process' page
> >>>   faults. This seems to me like exactly what CAP_SYS_PTRACE is for, though,
> >>>   so I think this use case can simply use a syscall without the powers
> >>>   CAP_SYS_PTRACE grants being "too much".
> >>>
> >>> - Why not use a syscall? Access to syscalls is generally controlled by
> >>>   capabilities.
> >>>   We don't have a capability which is used for userfaultfd access
> >>>   without also granting more / other permissions as well, and adding a new
> >>>   capability was rejected [1].
> >>>
> >>> - It's possible a LSM could be used to control access instead. I suspect
> >>>   adding a brand new one just for this would be rejected,
> >>
> >> You won't know if you don't ask.
> >
> > Fair enough - I wonder if MM folks (Andrew, Peter, Nadav especially)
> > would find that approach more palatable than /proc/[pid]/userfaultfd?
> > Would it make sense from our perspective to propose a userfaultfd- or
> > MM-specific LSM for controlling access to certain features?
> >
> > I remember +Andrea saying Red Hat was also interested in some kind of
> > access control mechanism like this. Would one or the other approach be
> > more convenient for you?
>
> To reiterate my position - I think that /proc/[pid]/userfaultfd is very
> natural and can be easily extended to support cross-process access of
> userfaultfd. The necessary access controls are simple in any case. For
> cross-process access, they are similar to those that are used for other
> /proc/[pid]/X such as pagemap.
>
> I have little experience with LSM and I do not know how real deployments use
> them. If they are easier to deploy and people prefer them over some
> pseudo-file, I cannot argue against them.