The patch titled proc_flush_task: flush /proc/tid/task/pid when a sub-thread exits has been added to the -mm tree. Its filename is proc_flush_task-flush-proc-tid-task-pid-when-a-sub-thread-exits.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find out what to do about this The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/ ------------------------------------------------------ Subject: proc_flush_task: flush /proc/tid/task/pid when a sub-thread exits From: Oleg Nesterov <oleg@xxxxxxxxxx> The exiting sub-thread flushes /proc/pid only, but this doesn't buy too much: ps and friends mostly use /proc/tid/task/pid. Remove "if (thread_group_leader())" checks from proc_flush_task() path, this means we always remove /proc/tid/task/pid dentry on exit, and this actually matches the comment above proc_flush_task(). The test-case: static void* tfunc(void *arg) { char name[256]; sprintf(name, "/proc/%d/task/%ld/status", getpid(), gettid()); close(open(name, O_RDONLY)); return NULL; } int main(void) { pthread_t t; for (;;) { if (!pthread_create(&t, NULL, &tfunc, NULL)) pthread_join(t, NULL); } } slabtop shows that pid/proc_inode_cache/etc grow quickly and "indefinitely" until the task is killed or shrink_slab() is called, not good. And the main thread needs a lot of time to exit. The same can happen if something like "ps -efL" runs continuously, while some application spawns short-living threads. Reported-by: "James M. Leddy" <jleddy@xxxxxxxxxx> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx> Cc: Alexey Dobriyan <adobriyan@xxxxxxxxx> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> Cc: Dominic Duval <dduval@xxxxxxxxxx> Cc: Frank Hirtz <fhirtz@xxxxxxxxxx> Cc: "Fuller, Johnray" <Johnray.Fuller@xxxxxx> Cc: Larry Woodman <lwoodman@xxxxxxxxxx> Cc: Paul Batkowski <pbatkowski@xxxxxxxxxx> Cc: Roland McGrath <roland@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- fs/proc/base.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff -puN fs/proc/base.c~proc_flush_task-flush-proc-tid-task-pid-when-a-sub-thread-exits fs/proc/base.c --- a/fs/proc/base.c~proc_flush_task-flush-proc-tid-task-pid-when-a-sub-thread-exits +++ a/fs/proc/base.c @@ -2601,9 +2601,6 @@ static void proc_flush_task_mnt(struct v dput(dentry); } - if (tgid == 0) - goto out; - name.name = buf; name.len = snprintf(buf, sizeof(buf), "%d", tgid); leader = d_hash_and_lookup(mnt->mnt_root, &name); @@ -2660,17 +2657,16 @@ out: void proc_flush_task(struct task_struct *task) { int i; - struct pid *pid, *tgid = NULL; + struct pid *pid, *tgid; struct upid *upid; pid = task_pid(task); - if (thread_group_leader(task)) - tgid = task_tgid(task); + tgid = task_tgid(task); for (i = 0; i <= pid->level; i++) { upid = &pid->numbers[i]; proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr, - tgid ? tgid->numbers[i].nr : 0); + tgid->numbers[i].nr); } upid = &pid->numbers[pid->level]; _ Patches currently in -mm which might be from oleg@xxxxxxxxxx are linux-next.patch getrusage-fill-ru_maxrss-value.patch getrusage-fill-ru_maxrss-value-update.patch proc_flush_task-flush-proc-tid-task-pid-when-a-sub-thread-exits.patch ptrace-__ptrace_detach-do-__wake_up_parent-if-we-reap-the-tracee.patch do_wait-wakeup-optimization-shift-security_task_wait-from-eligible_child-to-wait_consider_task.patch do_wait-wakeup-optimization-change-__wake_up_parent-to-use-filtered-wakeup.patch do_wait-wakeup-optimization-change-__wake_up_parent-to-use-filtered-wakeup-selinux_bprm_committed_creds-use-__wake_up_parent.patch do_wait-wakeup-optimization-child_wait_callback-check-__wnothread-case.patch do_wait-optimization-do-not-place-sub-threads-on-task_struct-children-list.patch wait_consider_task-kill-parent-argument.patch do_wait-fix-sys_waitid-specific-behaviour.patch wait_noreap_copyout-check-for-wo_info-=-null.patch signals-tracehook_notify_jctl-change.patch utrace-core.patch -- To unsubscribe from this list: send the line "unsubscribe mm-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html