On Thu, Oct 1, 2020 at 2:06 PM YiFei Zhu <zhuyifei1999@xxxxxxxxx> wrote: > On Wed, Sep 30, 2020 at 5:01 PM Jann Horn <jannh@xxxxxxxxxx> wrote: > > Hmm, this won't work, because the task could be exiting, and seccomp > > filters are detached in release_task() (using > > seccomp_filter_release()). And at the moment, seccomp_filter_release() > > just locklessly NULLs out the tsk->seccomp.filter pointer and drops > > the reference. > > > > The locking here is kind of gross, but basically I think you can > > change this code to use lock_task_sighand() / unlock_task_sighand() > > (see the other examples in fs/proc/base.c), and bail out if > > lock_task_sighand() returns NULL. And in seccomp_filter_release(), add > > something like this: > > > > /* We are effectively holding the siglock by not having any sighand. */ > > WARN_ON(tsk->sighand != NULL); > > Ah thanks. I was thinking about how tasks exit and get freed and that > sort of stuff, and how this would race against them. The last time I > worked with procfs there was some magic going on that I could not > figure out, so I was thinking if some magic will stop the task_struct > from being released, considering it's an argument here. > > I just looked at release_task and related functions; looks like it > will, at the end, decrease the reference count of the task_struct. > Does procfs increase the refcount while calling the procfs functions? > Hence, in procfs functions one can rely on the task_struct still being > a task_struct, but any direct effects of release_task may happen while > the procfs functions are running? Yeah. The ONE() entry you're adding to tgid_base_stuff is used to help instantiate a "struct inode" when someone looks up the path "/proc/$tgid/seccomp_cache"; then when that path is opened, a "struct file" is created that holds a reference to the inode; and while that file exists, your proc_pid_seccomp_cache() can be invoked. proc_pid_seccomp_cache() is invoked from proc_single_show() ("PROC_I(inode)->op.proc_show" is proc_pid_seccomp_cache), and proc_single_show() obtains a temporary reference to the task_struct using get_pid_task() on a "struct pid" and drops that reference afterwards with put_task_struct(). The "struct pid" is obtained from the "struct proc_inode", which is essentially a subclass of "struct inode". The "struct pid" is kept refererenced until the inode goes away, via proc_pid_evict_inode(), called by proc_evict_inode(). By looking at put_task_struct() and its callees, you can figure out which parts of the "struct task" are kept alive by the reference to it. By the way, maybe it'd make sense to add this to tid_base_stuff as well? That should just be one extra line of code. Seccomp filters are technically per-thread, so it would make sense to have them visible in the per-thread subdirectories /proc/$pid/task/$tid/.