This is a note to let you know that I've just added the patch titled tracing: Introduce trace_create_cpu_file() and to the 3.10-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: tracing-introduce-trace_create_cpu_file-and.patch and it can be found in the queue-3.10 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 649e9c70da6bfbeb563193a35d3424a5aa7c0d38 Mon Sep 17 00:00:00 2001 From: Oleg Nesterov <oleg@xxxxxxxxxx> Date: Tue, 23 Jul 2013 17:25:54 +0200 Subject: tracing: Introduce trace_create_cpu_file() and tracing_get_cpu() From: Oleg Nesterov <oleg@xxxxxxxxxx> commit 649e9c70da6bfbeb563193a35d3424a5aa7c0d38 upstream. Every "file_operations" used by tracing_init_debugfs_percpu is buggy. f_op->open/etc does: 1. struct trace_cpu *tc = inode->i_private; struct trace_array *tr = tc->tr; 2. trace_array_get(tr) or fail; 3. do_something(tc); But tc (and tr) can be already freed before trace_array_get() is called. And it doesn't matter whether this file is per-cpu or it was created by init_tracer_debugfs(), free_percpu() or kfree() are equally bad. Note that even 1. is not safe, the freed memory can be unmapped. But even if it was safe trace_array_get() can wrongly succeed if we also race with the next new_instance_create() which can re-allocate the same tr, or tc was overwritten and ->tr points to the valid tr. In this case 3. uses the freed/reused memory. Add the new trivial helper, trace_create_cpu_file() which simply calls trace_create_file() and encodes "cpu" in "struct inode". Another helper, tracing_get_cpu() will be used to read cpu_nr-or-RING_BUFFER_ALL_CPUS. The patch abuses ->i_cdev to encode the number, it is never used unless the file is S_ISCHR(). But we could use something else, say, i_bytes or even ->d_fsdata. In any case this hack is hidden inside these 2 helpers, it would be trivial to change them if needed. This patch only changes tracing_init_debugfs_percpu() to use the new trace_create_cpu_file(), the next patches will change file_operations. Note: tracing_get_cpu(inode) is always safe but you can't trust the result unless trace_array_get() was called, without trace_types_lock which acts as a barrier it can wrongly return RING_BUFFER_ALL_CPUS. Link: http://lkml.kernel.org/r/20130723152554.GA23710@xxxxxxxxxx Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- kernel/trace/trace.c | 50 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 36 insertions(+), 14 deletions(-) --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2834,6 +2834,17 @@ static int s_show(struct seq_file *m, vo return 0; } +/* + * Should be used after trace_array_get(), trace_types_lock + * ensures that i_cdev was already initialized. + */ +static inline int tracing_get_cpu(struct inode *inode) +{ + if (inode->i_cdev) /* See trace_create_cpu_file() */ + return (long)inode->i_cdev - 1; + return RING_BUFFER_ALL_CPUS; +} + static const struct seq_operations tracer_seq_ops = { .start = s_start, .next = s_next, @@ -5521,6 +5532,17 @@ static struct dentry *tracing_dentry_per return tr->percpu_dir; } +static struct dentry * +trace_create_cpu_file(const char *name, umode_t mode, struct dentry *parent, + void *data, long cpu, const struct file_operations *fops) +{ + struct dentry *ret = trace_create_file(name, mode, parent, data, fops); + + if (ret) /* See tracing_get_cpu() */ + ret->d_inode->i_cdev = (void *)(cpu + 1); + return ret; +} + static void tracing_init_debugfs_percpu(struct trace_array *tr, long cpu) { @@ -5540,28 +5562,28 @@ tracing_init_debugfs_percpu(struct trace } /* per cpu trace_pipe */ - trace_create_file("trace_pipe", 0444, d_cpu, - (void *)&data->trace_cpu, &tracing_pipe_fops); + trace_create_cpu_file("trace_pipe", 0444, d_cpu, + &data->trace_cpu, cpu, &tracing_pipe_fops); /* per cpu trace */ - trace_create_file("trace", 0644, d_cpu, - (void *)&data->trace_cpu, &tracing_fops); + trace_create_cpu_file("trace", 0644, d_cpu, + &data->trace_cpu, cpu, &tracing_fops); - trace_create_file("trace_pipe_raw", 0444, d_cpu, - (void *)&data->trace_cpu, &tracing_buffers_fops); + trace_create_cpu_file("trace_pipe_raw", 0444, d_cpu, + &data->trace_cpu, cpu, &tracing_buffers_fops); - trace_create_file("stats", 0444, d_cpu, - (void *)&data->trace_cpu, &tracing_stats_fops); + trace_create_cpu_file("stats", 0444, d_cpu, + &data->trace_cpu, cpu, &tracing_stats_fops); - trace_create_file("buffer_size_kb", 0444, d_cpu, - (void *)&data->trace_cpu, &tracing_entries_fops); + trace_create_cpu_file("buffer_size_kb", 0444, d_cpu, + &data->trace_cpu, cpu, &tracing_entries_fops); #ifdef CONFIG_TRACER_SNAPSHOT - trace_create_file("snapshot", 0644, d_cpu, - (void *)&data->trace_cpu, &snapshot_fops); + trace_create_cpu_file("snapshot", 0644, d_cpu, + &data->trace_cpu, cpu, &snapshot_fops); - trace_create_file("snapshot_raw", 0444, d_cpu, - (void *)&data->trace_cpu, &snapshot_raw_fops); + trace_create_cpu_file("snapshot_raw", 0444, d_cpu, + &data->trace_cpu, cpu, &snapshot_raw_fops); #endif } Patches currently in stable-queue which might be from oleg@xxxxxxxxxx are queue-3.10/tracing-introduce-trace_create_cpu_file-and.patch queue-3.10/tracing-change-tracing_fops-snapshot_fops-to-rely-on-tracing_get_cpu.patch queue-3.10/tracing-change-tracing_buffers_fops-to-rely-on-tracing_get_cpu.patch queue-3.10/tracing-change-tracing_stats_fops-to-rely-on.patch queue-3.10/tracing-change-tracing_pipe_fops-to-rely-on-tracing_get_cpu.patch queue-3.10/tracing-change-tracing_entries_fops-to-rely-on-tracing_get_cpu.patch queue-3.10/tracing-turn-event-id-i_private-into-call-event.type.patch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html