On Tue, Apr 27, 2021 at 2:51 AM Florent Revest <revest@xxxxxxxxxxxx> wrote:
>
> On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes
> <linux@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On 26/04/2021 23.08, Florent Revest wrote:
> > > On Mon, Apr 26, 2021 at 6:19 PM Andrii Nakryiko
> > > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >>
> > >> On Mon, Apr 26, 2021 at 3:10 AM Florent Revest <revest@xxxxxxxxxxxx> wrote:
> > >>>
> > >>> On Sat, Apr 24, 2021 at 12:38 AM Andrii Nakryiko
> > >>> <andrii.nakryiko@xxxxxxxxx> wrote:
> > >>>>
> > >>>> On Mon, Apr 19, 2021 at 8:52 AM Florent Revest <revest@xxxxxxxxxxxx> wrote:
> > >>>>>
> > >>>>> The "positive" part tests all format specifiers when things go well.
> > >>>>>
> > >>>>> The "negative" part makes sure that incorrect format strings fail at
> > >>>>> load time.
> > >>>>>
> > >>>>> Signed-off-by: Florent Revest <revest@xxxxxxxxxxxx>
> > >>>>> ---
> > >>>>> .../selftests/bpf/prog_tests/snprintf.c | 125 ++++++++++++++++++
> > >>>>> .../selftests/bpf/progs/test_snprintf.c | 73 ++++++++++
> > >>>>> .../bpf/progs/test_snprintf_single.c | 20 +++
> > >>>>> 3 files changed, 218 insertions(+)
> > >>>>> create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> > >>>>> create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf_single.c
> > >>>>>
> > >>>>> diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> new file mode 100644
> > >>>>> index 000000000000..a958c22aec75
> > >>>>> --- /dev/null
> > >>>>> +++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
> > >>>>> @@ -0,0 +1,125 @@
> > >>>>> +// SPDX-License-Identifier: GPL-2.0
> > >>>>> +/* Copyright (c) 2021 Google LLC. */
> > >>>>> +
> > >>>>> +#include <test_progs.h>
> > >>>>> +#include "test_snprintf.skel.h"
> > >>>>> +#include "test_snprintf_single.skel.h"
> > >>>>> +
> > >>>>> +#define EXP_NUM_OUT "-8 9 96 -424242 1337 DABBAD00"
> > >>>>> +#define EXP_NUM_RET sizeof(EXP_NUM_OUT)
> > >>>>> +
> > >>>>> +#define EXP_IP_OUT "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
> > >>>>> +#define EXP_IP_RET sizeof(EXP_IP_OUT)
> > >>>>> +
> > >>>>> +/* The third specifier, %pB, depends on compiler inlining so don't check it */
> > >>>>> +#define EXP_SYM_OUT "schedule schedule+0x0/"
> > >>>>> +#define MIN_SYM_RET sizeof(EXP_SYM_OUT)
> > >>>>> +
> > >>>>> +/* The third specifier, %p, is a hashed pointer which changes on every reboot */
> > >>>>> +#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
> > >>>>> +#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
> > >>>>> +
> > >>>>> +#define EXP_STR_OUT "str1 longstr"
> > >>>>> +#define EXP_STR_RET sizeof(EXP_STR_OUT)
> > >>>>> +
> > >>>>> +#define EXP_OVER_OUT "%over"
> > >>>>> +#define EXP_OVER_RET 10
> > >>>>> +
> > >>>>> +#define EXP_PAD_OUT " 4 000"
> > >>>>
> > >>>> Roughly 50% of the time I get failure for this test case:
> > >>>>
> > >>>> test_snprintf_positive:FAIL:pad_out unexpected pad_out: actual ' 4
> > >>>> 0000' != expected ' 4 000'
> > >>>>
> > >>>> Re-running this test case immediately passes. Running again most
> > >>>> probably fails. Please take a look.
> > >>>
> > >>> Do you have more information on how to reproduce this ?
> > >>> I spinned up a VM at 87bd9e602 with ./vmtest -s and then run this script:
> > >>>
> > >>> #!/bin/sh
> > >>> for i in `seq 1000`
> > >>> do
> > >>>     ./test_progs -t snprintf
> > >>>     if [ $? -ne 0 ];
> > >>>     then
> > >>>         echo FAILURE
> > >>>         exit 1
> > >>>     fi
> > >>> done
> > >>>
> > >>> The thousand executions passed.
> > >>>
> > >>> This is a bit concerning because your unexpected_pad_out seems to have
> > >>> an extra '0' so it ends up with strlen(pad_out)=11 but
> > >>> sizeof(pad_out)=10. The actual string writing is not really done by
> > >>> our helper code but by the snprintf implementation (str and str_size
> > >>> are only given to snprintf()) so I'd expect the truncation to work
> > >>> well there. I'm a bit puzzled
> > >>
> > >> I'm puzzled too, have no idea. I also can't repro this with vmtest.sh.
> > >> But I can quite reliably reproduce with my local ArchLinux-based qemu
> > >> image with different config (see [0] for config itself). So please try
> > >> with my config and see if that helps to repro. If not, I'll have to
> > >> debug it on my own later.
> > >>
> > >> [0] https://gist.github.com/anakryiko/4b6ae21680842bdeacca8fa99d378048
> > >
> > > I tried that config on the same commit 87bd9e602 (bpf-next/master)
> > > with my debian-based qemu image and I still can't reproduce the issue
> > > :| If I can be of any help let me know, I'd be happy to help
> > >
> > >
> > It's not really clear to me if this is before or after the rewrite to
> > use bprintf, but regardless, in those two patches this caught my attention:
>
> I tried to reproduce Andrii's bug both before and after the bprintf
> rewrite but I think he meant before.

I'm running on the latest bpf-next master, but I don't think it's
related to bprintf change.

>
> >         u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> > -       enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> > +       u32 *bin_args;
> >         static char buf[BPF_TRACE_PRINTK_SIZE];
> >         unsigned long flags;
> >         int ret;
> >
> > -       ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> > -                                MAX_TRACE_PRINTK_VARARGS);
> > +       ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
> > +                                 MAX_TRACE_PRINTK_VARARGS);
> >         if (ret < 0)
> >                 return ret;
> >
> > -       ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > -               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> > -       /* snprintf() will not append null for zero-length strings */
> > -       if (ret == 0)
> > -               buf[0] = '\0';
> > +       ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
> >
> >         raw_spin_lock_irqsave(&trace_printk_lock, flags);
> >         trace_bpf_trace_printk(buf);
> >         raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> >
> > Why isn't the write to buf[] protected by that spinlock? Or put another
> > way, what protects buf[] from concurrent writes?
>
> You're right, that is a bug, I missed that buf was static and thought
> it was just on the stack. That snprintf call should be after the
> raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my
> snprintf series, there was a vsprintf after the raw_spin_lock_irqsave)

Can you please also clean up unnecessary ()s you added in at least a
few places. Thanks.

>
> > Probably the test cases are not run in parallel, but this is the kind of
> > thing that would give those symptoms.
>
> I think it's a separate issue from what Andrii reported though because
> the flaky test exercises the bpf_snprintf helper and this buf spinlock
> bug you just found only affects the bpf_trace_printk helper.
>
> That being said, it does smell a little bit like a concurrency issue
> too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it
> attaches to all syscall entries and most likely gets executed many
> more times than necessary and probably on parallel CPUs. The "pad_out"
> buffer they write to is unique and not locked so maybe the test's
> userspace reads pad_out while another CPU is writing on it and if the
> string output goes through a stage where it is " 4 0000" before
> being " 4 000", we might read at the wrong time. That being said, I
> would find it weird that this happens as much as 50% of the time and
> always specifically on that test case.
>
> Andrii could you maybe try changing the prog type to
> "tp/syscalls/sys_enter_nanosleep" on the machine where you can
> reproduce this bug ?

Yes, it helps. I can't repro it easily anymore. I think the right fix,
though, should be to filter by tid, not change the tracepoint.

I think what's happening is we see the string right before bstr_printf
does zero-termination with end[-1] = '\0'; So in some cases we see
truncated string, in others we see untruncated one.

>
> > Rasmus
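
To make the suspected window concrete, here is a rough userspace model
of a truncating formatter that works in two phases. This is only a
sketch under assumptions: model_fill() and model_terminate() are
made-up names, and the code is not the kernel's vsnprintf/bstr_printf.
It only mimics the "copy what fits, then terminate at the last byte"
behaviour described above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical two-phase model of a truncating formatter. The names are
 * illustrative only; this is not the kernel implementation.
 */

/* Phase 1: copy as many source bytes as fit; return the untruncated length. */
static size_t model_fill(char *buf, size_t size, const char *src)
{
	size_t len = strlen(src);
	size_t n = len < size ? len : size;

	memcpy(buf, src, n);
	return len;
}

/* Phase 2: NUL-terminate, overwriting the last byte if the output was cut. */
static void model_terminate(char *buf, size_t size, size_t len)
{
	if (size == 0)
		return;
	if (len < size)
		buf[len] = '\0';
	else
		buf[size - 1] = '\0';
}
```

With a 4-byte buffer followed by zeroed memory and the source string
"abcdef", a reader that looks between the two phases sees "abcd", while
a reader that looks after model_terminate() sees "abc". Filtering the
BPF program by tid would keep other tasks from rewriting the buffer
while the test's userspace side reads it.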