[Let's add linux-api - please always cc this list when adding/modifying user visible interfaces] On Tue 28-03-17 15:01:28, Dmitry Vyukov wrote: > Add /proc/self/task/<current-tid>/fail-nth file that allows failing > 0-th, 1-st, 2-nd and so on calls systematically. > Excerpt from the added documentation: I didn't really get to read through details here but it just feels wrong to add this debugging only feature into proc. It also smells like one off thing as well. > === > Write to this file of integer N makes N-th call in the current task fail > (N is 0-based). Read from this file returns a single char 'Y' or 'N' > that says if the fault setup with a previous write to this file was > injected or not, and disables the fault if it wasn't yet injected. > Note that this file enables all types of faults (slab, futex, etc). > This setting takes precedence over all other generic settings like > probability, interval, times, etc. But per-capability settings > (e.g. fail_futex/ignore-private) take precedence over it. > This feature is intended for systematic testing of faults in a single > system call. See an example below. > === > > Why adding new setting: > 1. Existing settings are global rather than per-task. > So parallel testing is not possible. > 2. attr->interval is close but it depends on attr->count > which is non reset to 0, so interval does not work as expected. > 3. Trying to model this with existing settings requires manipulations > of all of probability, interval, times, space, task-filter and > unexposed count and per-task make-it-fail files. > 4. Existing settings are per-failure-type, and the set of failure > types is potentially expanding. > 5. make-it-fail can't be changed by unprivileged user and aggressive > stress testing better be done from an unprivileged user. > Similarly, this would require opening the debugfs files to the > unprivileged user, as he would need to reopen at least times file > (not possible to pre-open before dropping privs). > > The proposed interface solves all of the above (see the example). > > Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> > Cc: Akinobu Mita <akinobu.mita@xxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: linux-kernel@xxxxxxxxxxxxxxx > Cc: linux-mm@xxxxxxxxx > > --- > We want to integrate this into syzkaller fuzzer. > A prototype has found 10 bugs in kernel in first day of usage: > https://groups.google.com/forum/#!searchin/syzkaller/%22FAULT_INJECTION%22%7Csort:relevance > > Changes since v1: > - change file name from /sys/kernel/debug/fail_once > to /proc/self/task/<current-tid>/fail-nth as per > Akinobu suggestion > > --- > Documentation/fault-injection/fault-injection.txt | 78 +++++++++++++++++++++++ > fs/proc/base.c | 52 +++++++++++++++ > include/linux/sched.h | 1 + > kernel/fork.c | 4 ++ > lib/fault-inject.c | 7 ++ > 5 files changed, 142 insertions(+) > > diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt > index 415484f3d59a..192d8cbcc5f9 100644 > --- a/Documentation/fault-injection/fault-injection.txt > +++ b/Documentation/fault-injection/fault-injection.txt > @@ -134,6 +134,22 @@ use the boot option: > fail_futex= > mmc_core.fail_request=<interval>,<probability>,<space>,<times> > > +o proc entries > + > +- /proc/self/task/<current-tid>/fail-nth: > + > + Write to this file of integer N makes N-th call in the current task fail > + (N is 0-based). Read from this file returns a single char 'Y' or 'N' > + that says if the fault setup with a previous write to this file was > + injected or not, and disables the fault if it wasn't yet injected. > + Note that this file enables all types of faults (slab, futex, etc). > + This setting takes precedence over all other generic debugfs settings > + like probability, interval, times, etc. But per-capability settings > + (e.g. fail_futex/ignore-private) take precedence over it. > + > + This feature is intended for systematic testing of faults in a single > + system call. See an example below. > + > How to add new fault injection capability > ----------------------------------------- > > @@ -278,3 +294,65 @@ allocation failure. > # env FAILCMD_TYPE=fail_page_alloc \ > ./tools/testing/fault-injection/failcmd.sh --times=100 \ > -- make -C tools/testing/selftests/ run_tests > + > +Systematic faults using fail-nth > +--------------------------------- > + > +The following code systematically faults 0-th, 1-st, 2-nd and so on > +capabilities in the socketpair() system call. > + > +#include <sys/types.h> > +#include <sys/stat.h> > +#include <sys/socket.h> > +#include <sys/syscall.h> > +#include <fcntl.h> > +#include <unistd.h> > +#include <string.h> > +#include <stdlib.h> > +#include <stdio.h> > +#include <errno.h> > + > +int main() > +{ > + int i, err, res, fail_nth, fds[2]; > + char buf[128]; > + > + system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait"); > + sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid)); > + fail_nth = open(buf, O_RDWR); > + for (i = 0;; i++) { > + sprintf(buf, "%d", i); > + write(fail_nth, buf, strlen(buf)); > + res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds); > + err = errno; > + read(fail_nth, buf, 1); > + if (res == 0) { > + close(fds[0]); > + close(fds[1]); > + } > + printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err); > + if (buf[0] != 'Y') > + break; > + } > + return 0; > +} > + > +An example output: > + > +0-th fault Y: res=-1/23 > +1-th fault Y: res=-1/23 > +2-th fault Y: res=-1/23 > +3-th fault Y: res=-1/12 > +4-th fault Y: res=-1/12 > +5-th fault Y: res=-1/23 > +6-th fault Y: res=-1/23 > +7-th fault Y: res=-1/23 > +8-th fault Y: res=-1/12 > +9-th fault Y: res=-1/12 > +10-th fault Y: res=-1/12 > +11-th fault Y: res=-1/12 > +12-th fault Y: res=-1/12 > +13-th fault Y: res=-1/12 > +14-th fault Y: res=-1/12 > +15-th fault Y: res=-1/12 > +16-th fault N: res=0/12 > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 6e8655845830..66001172249b 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1353,6 +1353,53 @@ static const struct file_operations proc_fault_inject_operations = { > .write = proc_fault_inject_write, > .llseek = generic_file_llseek, > }; > + > +static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf, > + size_t count, loff_t *ppos) > +{ > + struct task_struct *task; > + int err, n; > + > + task = get_proc_task(file_inode(file)); > + if (!task) > + return -ESRCH; > + put_task_struct(task); > + if (task != current) > + return -EPERM; > + err = kstrtoint_from_user(buf, count, 10, &n); > + if (err) > + return err; > + if (n < 0 || n == INT_MAX) > + return -EINVAL; > + current->fail_nth = n + 1; > + return len; > +} > + > +static ssize_t proc_fail_nth_read(struct file *file, char __user *buf, > + size_t count, loff_t *ppos) > +{ > + struct task_struct *task; > + int err; > + > + task = get_proc_task(file_inode(file)); > + if (!task) > + return -ESRCH; > + put_task_struct(task); > + if (task != current) > + return -EPERM; > + if (count < 1) > + return -EINVAL; > + err = put_user((char)(current->fail_nth ? 'N' : 'Y'), buf); > + if (err) > + return err; > + current->fail_nth = 0; > + return 1; > +} > + > +static const struct file_operations proc_fail_nth_operations = { > + .read = proc_fail_nth_read, > + .write = proc_fail_nth_write, > +}; > #endif > > > @@ -3296,6 +3343,11 @@ static const struct pid_entry tid_base_stuff[] = { > #endif > #ifdef CONFIG_FAULT_INJECTION > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations), > + /* > + * Operations on the file check that the task is current, > + * so we create it with 0666 to support testing under unprivileged user. > + */ > + REG("fail-nth", 0666, proc_fail_nth_operations), > #endif > #ifdef CONFIG_TASK_IO_ACCOUNTING > ONE("io", S_IRUSR, proc_tid_io_accounting), > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 543e0ea82684..7b50221fea51 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1897,6 +1897,7 @@ struct task_struct { > #endif > #ifdef CONFIG_FAULT_INJECTION > int make_it_fail; > + int fail_nth; > #endif > /* > * when (nr_dirtied >= nr_dirtied_pause), it's time to call > diff --git a/kernel/fork.c b/kernel/fork.c > index 61284d8122fa..869c97a0a930 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -545,6 +545,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) > > kcov_task_init(tsk); > > +#ifdef CONFIG_FAULT_INJECTION > + tsk->fail_nth = 0; > +#endif > + > return tsk; > > free_stack: > diff --git a/lib/fault-inject.c b/lib/fault-inject.c > index 6a823a53e357..d6516ba64d33 100644 > --- a/lib/fault-inject.c > +++ b/lib/fault-inject.c > @@ -107,6 +107,12 @@ static inline bool fail_stacktrace(struct fault_attr *attr) > > bool should_fail(struct fault_attr *attr, ssize_t size) > { > + if (in_task() && current->fail_nth) { > + if (--current->fail_nth == 0) > + goto fail; > + return false; > + } > + > /* No need to check any other properties if the probability is 0 */ > if (attr->probability == 0) > return false; > @@ -134,6 +140,7 @@ bool should_fail(struct fault_attr *attr, ssize_t size) > if (!fail_stacktrace(attr)) > return false; > > +fail: > fail_dump(attr); > > if (atomic_read(&attr->times) != -1) > -- > 2.12.2.564.g063fe858b8-goog > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@xxxxxxxxx. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a> -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html