On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> wrote: > > On 26/04/2021 23.08, Florent Revest wrote: > > On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko > > <andrii.nakryiko@xxxxxxxxx> wrote: > >> > >> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <revest@xxxxxxxxxxxx> wrote: > >>> > >>> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko > >>> <andrii.nakryiko@xxxxxxxxx> wrote: > >>>> > >>>> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <revest@xxxxxxxxxxxx> wrote: > >>>>> > >>>>> The "positive" part tests all format specifiers when things go well. > >>>>> > >>>>> The "negative" part makes sure that incorrect format strings fail at > >>>>> load time. > >>>>> > >>>>> Signed-off-by: Florent Revest <revest@xxxxxxxxxxxx> > >>>>> --- > >>>>> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++ > >>>>> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++ > >>>>> .../bpf/progs/test_snprintf_single.c | 20 +++ > >>>>> 3 files changed, 218 insertions(+) > >>>>> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c > >>>>> > >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c > >>>>> new file mode 100644 > >>>>> index 000000000000..a958c22aec75 > >>>>> --- /dev/null > >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c > >>>>> @@ -0,0 +1,125 @@ > >>>>> +// SPDX-License-Identifier: GPL-2.0 > >>>>> +/* Copyright (c) 2021 Google LLC. */ > >>>>> + > >>>>> +#include <test_progs.h> > >>>>> +#include "test_snprintf.skel.h" > >>>>> +#include "test_snprintf_single.skel.h" > >>>>> + > >>>>> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00" > >>>>> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT) > >>>>> + > >>>>> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001" > >>>>> +#define EXP_IP_RET sizeof(EXP_IP_OUT) > >>>>> + > >>>>> +/* The third specifier, %pB, depends on compiler inlining so don't check it */ > >>>>> +#define EXP_SYM_OUT "schedule schedule+0x0/" > >>>>> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT) > >>>>> + > >>>>> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */ > >>>>> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 " > >>>>> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr") > >>>>> + > >>>>> +#define EXP_STR_OUT "str1 longstr" > >>>>> +#define EXP_STR_RET sizeof(EXP_STR_OUT) > >>>>> + > >>>>> +#define EXP_OVER_OUT "%over" > >>>>> +#define EXP_OVER_RET 10 > >>>>> + > >>>>> +#define EXP_PAD_OUT " 4 000" > >>>> > >>>> Roughly 50% of the time I get failure for this test case: > >>>> > >>>> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4 > >>>> 0000' != expected ' 4 000' > >>>> > >>>> Re-running this test case immediately passes. Running again most > >>>> probably fails. Please take a look. > >>> > >>> Do you have more information on how to reproduce this ? > >>> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script: > >>> > >>> #!/bin/sh > >>> for i in `seq 1000` > >>> do > >>> ./test_progs -t snprintf > >>> if [ $? -ne 0 ]; > >>> then > >>> echo FAILURE > >>> exit 1 > >>> fi > >>> done > >>> > >>> The thousand executions passed. > >>> > >>> This is a bit concerning because your unexpected_pad_out seems to have > >>> an extra '0' so it ends up with strlen(pad_out)=11 but > >>> sizeof(pad_out)=10. The actual string writing is not really done by > >>> our helper code but by the snprintf implementation (str and str_size > >>> are only given to snprintf()) so I'd expect the truncation to work > >>> well there. I'm a bit puzzled > >> > >> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh. > >> But I can quite reliably reproduce with my local ArchLinux-based qemu > >> image with different config (see [0] for config itself). So please try > >> with my config and see if that helps to repro. If not, I'll have to > >> debug it on my own later. > >> > >> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048 > > > > I tried that config on the same commit 87bd9e602 (bpf-next/master) > > with my debian-based qemu image and I still can't reproduce the issue > > :| If I can be of any help let me know, I'd be happy to help > > > > It's not really clear to me if this is before or after the rewrite to > use bprintf, but regardless, in those two patches this caught my attention: I tried to reproduce Andrii's bug both before and after the bprintf rewrite but I think he meant before. > u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 }; > - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS]; > + u32 *bin_args; > static char buf[BPF_TRACE_PRINTK_SIZE]; > unsigned long flags; > int ret; > > - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod, > - MAX_TRACE_PRINTK_VARARGS); > + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args, > + MAX_TRACE_PRINTK_VARARGS); > if (ret < 0) > return ret; > > - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod), > - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod)); > - /* snprintf() will not append null for zero-length strings */ > - if (ret == 0) > - buf[0] = '\0'; > + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args); > > raw_spin_lock_irqsave(&trace_printk_lock, flags); > trace_bpf_trace_printk(buf); > raw_spin_unlock_irqrestore(&trace_printk_lock, flags); > > Why isn't the write to buf[] protected by that spinlock? Or put another > way, what protects buf[] from concurrent writes? You're right, that is a bug, I missed that buf was static and thought it was just on the stack. That snprintf call should be after the raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my snprintf series, there was a vsprintf after the raw_spin_lock_irqsave) > Probably the test cases are not run in parallel, but this is the kind of > thing that would give those symptoms. I think it's a separate issue from what Andrii reported though because the flaky test exercises the bpf_snprintf helper and this buf spinlock bug you just found only affects the bpf_trace_printk helper. That being said, it does smell a little bit like a concurrency issue too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it attaches to all syscall entries and most likely gets executed many more times than necessary and probably on parallel CPUs. The "pad_out" buffer they write to is unique and not locked so maybe the test's userspace reads pad_out while another CPU is writing on it and if the string output goes through a stage where it is " 4 0000" before being " 4 000", we might read at the wrong time. That being said, I would find it weird that this happens as much as 50% of the time and always specifically on that test case. Andrii could you maybe try changing the prog type to "tp/syscalls/sys_enter_nanosleep" on the machine where you can reproduce this bug ? > Rasmus