On Tue, Jun 12, 2018 at 4:16 PM, Tycho Andersen <tycho@xxxxxxxx> wrote: > Hi Matthew, > > On Tue, Jun 12, 2018 at 02:39:03PM -0700, Matthew Helsley wrote: > > On Thu, May 31, 2018 at 7:49 AM, Tycho Andersen <tycho@xxxxxxxx> wrote: > > > > <snip> > > > > > > > +struct seccomp_notif { > > > + __u64 id; > > > + pid_t pid; > > > + struct seccomp_data data; > > > +}; > > > > > > > Since it's part of the UAPI I think it would be good to add documentation > > to this patch series. Part of that documentation should talk about which > > pid namespaces this pid value is relevant in. This is especially > important > > since the feature is designed for use by things like container/sandbox > > managers. > > Yes, at least Documentation/userspace-api/seccomp_filter.txt should be > updated. I'll add that for the next series. > Are there some relevant man pages too that should be updated too? > > > > +static void seccomp_do_user_notification(int this_syscall, > > > + struct seccomp_filter *match, > > > + const struct seccomp_data *sd) > > > +{ > > > + int err; > > > + long ret = 0; > > > + struct seccomp_knotif n = {}; > > > + > > > + mutex_lock(&match->notify_lock); > > > + err = -ENOSYS; > > > + if (!match->has_listener) > > > + goto out; > > > + > > > + n.pid = current->pid; > > > > > > > How have you tested this code for correctness? I don't see many > > combinations being tested below nor here: > > https://github.com/tych0/kernel-utils/blob/master/ > seccomp/notify_stress.c > > > > What about processes in different pid namespaces? Make tests that vary > key > > parameters like these between the task generating the notifications and > the > > task interested in processing the notifications. Make tests that try to > > kill them at interesting times too. etc. Make tests that pass the > > notification fd around and see how they work (or not). > > > > I ask about testing because you're effectively returning a pid value to > > userspace here but not using the proper macros to access the task's > struct > > pid for that purpose. That will work so long as both processes are in the > > same pid namespace but breaks otherwise. > > > > Furthermore, a pid value is not the best solution for the queueing of > these > > notifications because a single pid value is not meaningful/correct > outside > > current's pid namespace and the seccomp notification file descriptor > could > > be passed on to processes in another pid namespaces; this pid value will > > almost always not be relevant or correct there hence taking a reference > to > > Well, it *has* to be relevant in some cases: if you want to access > some of the task's memory to e.g. read the args to the syscall, you > need to ptrace or map_files to access the memory. So we do need to > pass it, but, > > > the struct pid is useful. Then, during read(), the code would use the > > proper macro to turn the struct pid reference into a pid value relevant > in > > the *reader's* pid namespace *or* return something obviously bogus if the > > reader is in a pid namespace that can't see that pid. This could prevent > > management processes from being tricked into clobbering the wrong > process, > > feeding the wrong process sensitive information via syscall results, etc. > > Yes, this makes sense. Seems like we need to do some magic about > passing the tracee's task struct to the listener, so it can do > task_pid_vnr(). We could perhaps require the listener to be in the > init_pid_ns case, but I think things like socket() might still be > useful even if you can't read the tracee's memory. You could take a reference to the pid rather than the task then use pid_vnr(). In that case the changes should result in something like: struct seccomp_knotif { /* The pid whose filter triggered the notification */ struct pid *pid; ... static void seccomp_do_user_notification(...) { ... n.pid = get_task_pid(current, PIDTYPE_PID); ... remove_list: list_del(&n.list); put_pid(n.pid); ... } ... static ssize_t seccomp_notify_read(...) { ... unotif.pid = pid_vnr(knotif->pid); ... } I like holding the pid reference because it's what we do elsewhere when pid namespaces are a concern and it more precisely specifies what the knotif content needs to convey. Otherwise I don't think it makes a difference. Cheers, -Matt _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers