On Wed, Feb 12, 2020 at 11:10 AM Stephen Smalley <sds@xxxxxxxxxxxxx> wrote: > > On 2/12/20 2:04 PM, Daniel Colascione wrote: > > On Wed, Feb 12, 2020 at 10:59 AM Stephen Smalley <sds@xxxxxxxxxxxxx> wrote: > >> > >> On 2/12/20 1:04 PM, Stephen Smalley wrote: > >>> On 2/12/20 12:19 PM, Daniel Colascione wrote: > >>>> Thanks for taking a look. > >>>> > >>>> On Wed, Feb 12, 2020 at 9:04 AM Stephen Smalley <sds@xxxxxxxxxxxxx> > >>>> wrote: > >>>>> > >>>>> On 2/11/20 5:55 PM, Daniel Colascione wrote: > >>>>>> Use the secure anonymous inode LSM hook we just added to let SELinux > >>>>>> policy place restrictions on userfaultfd use. The create operation > >>>>>> applies to processes creating new instances of these file objects; > >>>>>> transfer between processes is covered by restrictions on read, write, > >>>>>> and ioctl access already checked inside selinux_file_receive. > >>>>>> > >>>>>> Signed-off-by: Daniel Colascione <dancol@xxxxxxxxxx> > >>>>> > >>>>> (please add linux-fsdevel and viro to the cc for future versions of this > >>>>> patch since it changes the VFS) > >>>>> > >>>>>> --- > >>>>>> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c > >>>>>> index 1659b59fb5d7..e178f6f40e93 100644 > >>>>>> --- a/security/selinux/hooks.c > >>>>>> +++ b/security/selinux/hooks.c > >>>>>> @@ -2915,6 +2919,69 @@ static int selinux_inode_init_security(struct > >>>>>> inode *inode, struct inode *dir, > >>>>>> + > >>>>>> + /* > >>>>>> + * We shouldn't be creating secure anonymous inodes before LSM > >>>>>> + * initialization completes. > >>>>>> + */ > >>>>>> + if (unlikely(!selinux_state.initialized)) > >>>>>> + return -EBUSY; > >>>>> > >>>>> I don't think this is viable; any arbitrary actions are possible before > >>>>> policy is loaded, and a Linux distro can be brought up fully with > >>>>> SELinux enabled and no policy loaded. You'll just need to have a > >>>>> default behavior prior to initialization. > >>>> > >>>> We'd have to fail open then, I think, and return an S_PRIVATE inode > >>>> (the regular anon inode). > >>> > >>> Not sure why. You aren't doing anything in the hook that actually > >>> relies on selinux_state.initialized being set (i.e. nothing requires a > >>> policy). The avc_has_perm() call will just succeed until a policy is > >>> loaded. So if these inodes are created prior to policy load, they will > >>> get assigned the task SID (which would be the kernel SID prior to policy > >>> load or first exec or write to /proc/self/attr/current afterward) and > >>> UFFD class (in your current code), be permitted, and then once policy is > >>> loaded any further access will get checked against the kernel SID. > >>> > >>>>>> + /* > >>>>>> + * We only get here once per ephemeral inode. The inode has > >>>>>> + * been initialized via inode_alloc_security but is otherwise > >>>>>> + * untouched, so check that the state is as > >>>>>> + * inode_alloc_security left it. > >>>>>> + */ > >>>>>> + BUG_ON(isec->initialized != LABEL_INVALID); > >>>>>> + BUG_ON(isec->sclass != SECCLASS_FILE); > >>>>> > >>>>> I think the kernel discourages overuse of BUG_ON/BUG/... > >>>> > >>>> I'm not sure what counts as overuse. > >>> > >>> Me either (not my rule) but I'm pretty sure this counts or you'd see a > >>> lot more of these kinds of BUG_ON() checks throughout. Try to reserve > >>> them for really critical cases. > >>> > >>>>>> + > >>>>>> +#ifdef CONFIG_USERFAULTFD > >>>>>> + if (fops == &userfaultfd_fops) > >>>>>> + isec->sclass = SECCLASS_UFFD; > >>>>>> +#endif > >>>>> > >>>>> Not sure we want or need to introduce a new security class for each user > >>>>> of anonymous inodes since the permissions should be the same as for > >>>>> file. > >>>> > >>>> The purpose of this change is to apply special policy to userfaultfd > >>>> FDs in particular. Isn't having a UFFD security class the best way to > >>>> go about that? (There's no path.) Am I missing something? > >>> > >>> It is probably the simplest approach; it just doesn't generalize to all > >>> users of anonymous inodes. We can distinguish them in one of two ways: > >>> use a different class like you did (requires a code change every time we > >>> add a new one and yet another duplicate of the file class) or use a > >>> different SID/context/type. The latter could be achieved by calling > >>> security_transition_sid() with the provided name wrapped in a qstr and > >>> specifying type_transition rules on the name. Then policy could define > >>> derived types for each domain, ala > >>> type_transition init self:file "[userfaultfd]" init_userfaultfd; > >>> type_transition untrusted_app self:file "[userfaultfd]" > >>> untrusted_app_userfaultfd; > >>> ... > >>> > >>>>> Also not sure we want to be testing fops for each such case. > >>>> > >>>> I was also thinking of just providing some kind of context string > >>>> (maybe the name), which might be friendlier to modules, but the loose > >>>> coupling kind of scares me, and for this particular application, since > >>>> UFFD is always in the core and never in a module, checking the fops > >>>> seems a bit more robust and doesn't hurt anything. > >>> > >>> Yes, not sure how the vfs folks feel about either coupling (the > >>> name-based one or the fops-based one). Neither seems great. > >>> > >>>>> We > >>>>> were looking at possibly leveraging the name as a key and using > >>>>> security_transition_sid() to generate a distinct SID/context/type for > >>>>> the inode via type_transition rules in policy. We have some WIP along > >>>>> those lines. > >>>> > >>>> Where? Any chance it would be ready soon? I'd rather not hold up this > >>>> work for a more general mechanism. > >>> > >>> Hopefully will have a patch available soon. But not saying this > >>> necessarily has to wait either. > >>> > >>>>>> + /* > >>>>>> + * Always give secure anonymous inodes the sid of the > >>>>>> + * creating task. > >>>>>> + */ > >>>>>> + > >>>>>> + isec->sid = tsec->sid; > >>>>> > >>>>> This doesn't generalize for other users of anonymous inodes, e.g. the > >>>>> /dev/kvm case where we'd rather inherit the SID and class from the > >>>>> original /dev/kvm inode itself. > >>>> > >>>> I think someone mentioned on the first version of this patch that we > >>>> could make it more flexible if the need arose. If we do want to do it > >>>> now, we could have the anon_inode security hook accept a "parent" or > >>>> "context" inode that modules could inspect for the purposes of forming > >>>> the new inode's SID. Does that make sense to you? > >>> > >>> Yes, that's the approach in our current WIP, except we call it a > >>> "related" inode since it isn't necessarily connected to the anon inode > >>> in any vfs sense. > >> > >> The other key difference in our WIP approach is that we assumed that we > >> couldn't mandate allocating a separate anon inode for each of these fds > >> and we wanted to cover all anonymous inodes (not opt-in), so we are > >> storing the SID/class pair as additional fields in the > >> file_security_struct and have modified file_has_perm() and others to > >> look there for anonymous inodes. > > > > A separate inode seems like the simpler approach for now, because it > > means that we have fewer places to check for security information --- > > and it's not as if an inode is particularly expensive. We can always > > switch later. > > We'd prefer having a separate inode if possible but didn't think that > would fly with the vfs folks, Let's ask them. > especially if we try to apply this to all > anonymous inodes. For the moment, we're not. > It might be ok for userfaultfd usage as a specific > case but there is a reason why anonymous inodes were introduced and > creating a separate inode each time defeats that purpose IIUC. It will > be interesting to see how they respond. Sort of. Anonymous inodes also free other parts of the kernel from having to deal with special-purpose filesystems (like pipefs) on which to hang custom inodes. It's just a generic "just give me an inode and I don't care about the filesystem" feature, and if we actually get a new inode each time, we still do the job. Pipe seems to be good with creating inodes each time.