On Mon, Mar 01, 2021 at 02:21:56PM +0100, Christian Brauner wrote:
> On Mon, Mar 01, 2021 at 12:09:09PM +0100, Christian Brauner wrote:
> > On Sat, Feb 20, 2021 at 01:31:57AM -0800, Sargun Dhillon wrote:
> > > We've run into a problem where attaching a filter can be quite messy
> > > business because the filter itself intercepts sendmsg, and other
> > > syscalls related to exfiltrating the listener FD. I believe that this
> > > problem set has been brought up before, and although there are
> > > "simpler" methods of exfiltrating the listener, like clone3 or
> > > pidfd_getfd, but these are still less than ideal.

I'm trying to make sure I understand: the target process would like to
have a filter attached that blocks sendmsg, but that would mean it has
no way to send the listener FD to its manager? And you'd want to have
listening working for sendmsg (otherwise you could do it with two
filters, I imagine)?

> >   int fd_filter = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_DETACHED, &prog);
> >
> >   BARRIER_WAIT_SETUP_DONE;
> >
> >   int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, INT_TO_PTR(fd_listener));
>
> This obviously should've been sm like:
>
> struct seccomp_filter_attach {
>         union {
>                 __s32 pidfd;
>                 __s32 pid;
>         };
>         __u32 fd_filter;
> };
>
> and then
>
> int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, seccomp_filter_attach);

Given the difficulty with TSYNC, I'm not excited about adding an "apply
this filter to another process" API. :)

The prior thread was here:
https://lore.kernel.org/lkml/20201029075841.GB29881@ircssh-2.c.rugged-nimbus-611.internal/

But I haven't had time to follow up. Both Andy and Sargun discuss filter
"replacement", but I'm not a fan of that, since I'd really like to keep
the "additive-only" property of seccomp.

So, I'm still back to wanting an answer to my questions at the end of
https://lore.kernel.org/lkml/202010281503.3D1FCFE0@keescook/

Namely, how to best indicate the point of execution where "delayed"
filters become applied?
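
To make that question concrete, here is a toy user-space model in C of a
task that holds one "delayed" filter next to its live, additive-only
filter stack. It is only a sketch: every type and function name below is
hypothetical, none of it is kernel code, and it assumes (as one possible
answer) that the delayed filter goes live when a chosen syscall number is
entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILTERS 8

/* Hypothetical verdicts; real seccomp has many more actions. */
enum verdict { VERDICT_ALLOW, VERDICT_DENY };

/* A toy "filter": deny exactly one syscall number, allow everything else. */
struct toy_filter {
	long denied_nr;
};

/* Toy task state: a live stack that only grows, plus one pending filter. */
struct toy_task {
	struct toy_filter live[MAX_FILTERS];
	size_t depth;
	bool has_pending;
	long trip_nr;			/* syscall number that arms the pending filter */
	struct toy_filter pending;
};

/* Push a filter onto the live stack; nothing is ever removed. */
static void toy_attach(struct toy_task *t, struct toy_filter f)
{
	assert(t->depth < MAX_FILTERS);
	t->live[t->depth++] = f;
}

/* Park a filter until the task enters syscall trip_nr. */
static void toy_attach_delayed(struct toy_task *t, struct toy_filter f,
			       long trip_nr)
{
	t->pending = f;
	t->trip_nr = trip_nr;
	t->has_pending = true;
}

/*
 * Model of syscall entry: evaluate nr against the live stack first, then,
 * if nr is the trip point, move the pending filter onto the stack.
 * Installing even when the trip syscall itself was denied is a choice
 * made by this sketch, not something the thread settles.
 */
static enum verdict toy_syscall_enter(struct toy_task *t, long nr)
{
	enum verdict v = VERDICT_ALLOW;

	for (size_t i = 0; i < t->depth; i++)
		if (t->live[i].denied_nr == nr)
			v = VERDICT_DENY;

	if (t->has_pending && nr == t->trip_nr) {
		toy_attach(t, t->pending);
		t->has_pending = false;
	}
	return v;
}
```

In this model the trip syscall is judged by the old stack only, and every
later syscall also sees the newly installed filter. It keeps the pending
filter outside the live stack purely for simplicity; that is not a claim
about how the kernel should store it.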

If we require supporting the "2b" (launched oblivious target) case
(which I think we must), we need to signal it externally, or via an
automatic trip point. Since synchronizing with an oblivious target is
rather nasty (e.g. involving ptrace or at least ptrace access checking),
I'd rather create a predefined trip point. Having it be "execve" limits
the utility of this feature for cooperating targets, though, so I think
"apply on exec" isn't great.

	struct seccomp_filter_attach_trigger {
		u64 nr;
		unsigned char *filter;
	};

	seccomp(SECCOMP_ATTACH_FILTER_TRIGGER, 0, seccomp_filter_attach_trigger);

After "nr" is evaluated (but before it runs), seccomp installs the
filter. And by "installs", I'm not sure if it needs to keep it in a
queue, with separate ref counting, or if it should be in the main filter
stack, but have an "alive" toggle, or what.

--
Kees Cook