The patch titled

     Remove down_write() from taskstats code invoked on the exit() path

has been added to the -mm tree.  Its filename is

     remove-down_write-from-taskstats-code-invoked-on-the-exit-path.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

------------------------------------------------------
Subject: Remove down_write() from taskstats code invoked on the exit() path
From: Shailabh Nagar <nagar@xxxxxxxxxxxxxx>

In send_cpu_listeners(), which is called on the exit path, a down_write() was
protecting operations like skb_clone() and genlmsg_unicast() that do
GFP_KERNEL allocations.  If the oom-killer decides to kill tasks to satisfy
the allocations, the exit of those tasks could block on the same semaphore.

The down_write() was only needed to allow removal of invalid listeners from
the listener list.  The patch converts the down_write() to a down_read() and
defers the removal to a separate critical region.  This ensures that even if
the oom-killer is called, no other task's exit is blocked, as it can still
acquire another down_read().

Thanks to Andrew Morton & Herbert Xu for pointing out the oom related
pitfalls, and to Chandra Seetharaman for suggesting this fix instead of using
something more complex like RCU.

Signed-off-by: Chandra Seetharaman <sekharan@xxxxxxxxxx>
Signed-off-by: Shailabh Nagar <nagar@xxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 kernel/taskstats.c |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff -puN kernel/taskstats.c~remove-down_write-from-taskstats-code-invoked-on-the-exit-path kernel/taskstats.c
--- a/kernel/taskstats.c~remove-down_write-from-taskstats-code-invoked-on-the-exit-path
+++ a/kernel/taskstats.c
@@ -51,6 +51,7 @@ __read_mostly = {
 struct listener {
 	struct list_head list;
 	pid_t pid;
+	char valid;
 };
 
 struct listener_list {
@@ -127,7 +128,7 @@ static int send_cpu_listeners(struct sk_
 	struct listener *s, *tmp;
 	struct sk_buff *skb_next, *skb_cur = skb;
 	void *reply = genlmsg_data(genlhdr);
-	int rc, ret;
+	int rc, ret, delcount = 0;
 
 	rc = genlmsg_end(skb, reply);
 	if (rc < 0) {
@@ -137,7 +138,7 @@ static int send_cpu_listeners(struct sk_
 
 	rc = 0;
 	listeners = &per_cpu(listener_array, cpu);
-	down_write(&listeners->sem);
+	down_read(&listeners->sem);
 	list_for_each_entry_safe(s, tmp, &listeners->list, list) {
 		skb_next = NULL;
 		if (!list_islast(&s->list, &listeners->list)) {
@@ -150,14 +151,26 @@ static int send_cpu_listeners(struct sk_
 		}
 		ret = genlmsg_unicast(skb_cur, s->pid);
 		if (ret == -ECONNREFUSED) {
-			list_del(&s->list);
-			kfree(s);
+			s->valid = 0;
+			delcount++;
 			rc = ret;
 		}
 		skb_cur = skb_next;
 	}
-	up_write(&listeners->sem);
+	up_read(&listeners->sem);
+
+	if (!delcount)
+		return rc;
 
+	/* Delete invalidated entries */
+	down_write(&listeners->sem);
+	list_for_each_entry_safe(s, tmp, &listeners->list, list) {
+		if (!s->valid) {
+			list_del(&s->list);
+			kfree(s);
+		}
+	}
+	up_write(&listeners->sem);
 	return rc;
 }
 
@@ -290,6 +303,7 @@ static int add_del_listener(pid_t pid, c
 		goto cleanup;
 	s->pid = pid;
 	INIT_LIST_HEAD(&s->list);
+	s->valid = 1;
 
 	listeners = &per_cpu(listener_array, cpu);
 	down_write(&listeners->sem);
_

Patches currently in -mm which might be from nagar@xxxxxxxxxxxxxx are

netlink-improve-string-attribute-validation.patch
list_islast-utility.patch
per-task-delay-accounting-setup.patch
per-task-delay-accounting-sync-block-i-o-and-swapin-delay-collection.patch
per-task-delay-accounting-cpu-delay-collection-via-schedstats.patch
per-task-delay-accounting-utilities-for-genetlink-usage.patch
per-task-delay-accounting-taskstats-interface.patch
per-task-delay-accounting-taskstats-interface-fix.patch
per-task-delay-accounting-delay-accounting-usage-of-taskstats-interface.patch
per-task-delay-accounting-documentation.patch
per-task-delay-accounting-proc-export-of-aggregated-block-i-o-delays.patch
delay-accounting-taskstats-interface-send-tgid-once.patch
per-task-delay-accounting-taskstats-interface-documentation-fix.patch
per-task-delay-accounting-avoid-send-without-listeners.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-2.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-3.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-cleanup.patch
per-task-delay-accounting-taskstats-interface-control-exit-data-through-cpumasks-fix-cleanup-fix.patch
remove-down_write-from-taskstats-code-invoked-on-the-exit-path.patch
task-watchers-register-per-task-delay-accounting.patch