On Fri, May 29, 2020 at 01:32:03AM +0200, Jann Horn wrote: > On Fri, May 29, 2020 at 1:11 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote: > > On Thu, May 28, 2020 at 05:14:11PM +0200, Christian Brauner wrote: > > > * @usage: reference count to manage the object lifetime. > > > * get/put helpers should be used when accessing an instance > > > * outside of a lifetime-guarded section. In general, this > > > * is only needed for handling filters shared across tasks. > > > [...] > > > + * @live: Number of tasks that use this filter directly and number > > > + * of dependent filters that have a non-zero @live counter. > > > + * Altered during fork(), exit(), and filter installation > > > [...] > > > refcount_set(&sfilter->usage, 1); > > > + refcount_set(&sfilter->live, 1); > [...] > > After looking at these other lifetime management examples in the kernel, > > I'm convinced that tracking these states separately is correct, but I > > remain uncomfortable about task management needing to explicitly make > > two calls to let go of the filter. > > > > I wonder if release_task() should also detach the filter from the task > > and do a put_seccomp_filter() instead of waiting for task_free(). This > > is supported by the other place where seccomp_filter_release() is > > called: > > > > > @@ -396,6 +400,7 @@ static inline void seccomp_sync_threads(unsigned long flags) > > > * allows a put before the assignment.) > > > */ > > > put_seccomp_filter(thread); > > > + seccomp_filter_release(thread); > > > > This would also remove the only put_seccomp_filter() call outside of > > seccomp.c, since the free_task() call will be removed now in favor of > > the task_release() call. > > > > So, is it safe to detach the filter in release_task()? Has dethreading > > happened yet? i.e. can we race TSYNC? -- is there a possible > > inc-from-zero? > > release_task -> __exit_signal -> __unhash_process -> > list_del_rcu(&p->thread_node) drops us from the thread list under > siglock, which is the same lock TSYNC uses. We should move us after write_unlock_irq(&tasklist_lock). We're after __exit_signal() so we're unhashed and can't be discovered by tsync too anymore and we also don't require the tasklist_lock to be held: diff --git a/kernel/exit.c b/kernel/exit.c index b332e3635eb5..5490cc04f436 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -193,8 +193,6 @@ void release_task(struct task_struct *p) cgroup_release(p); write_lock_irq(&tasklist_lock); ptrace_release_task(p); thread_pid = get_pid(p->thread_pid); @@ -220,6 +218,7 @@ void release_task(struct task_struct *p) } write_unlock_irq(&tasklist_lock); + seccomp_filter_release(p); proc_flush_pid(thread_pid); put_pid(thread_pid); release_thread(p); Christian _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers