Re: [PATCH] bpf: remove pointless code from bpf_do_trace_printk()

Florent Revest <revest@xxxxxxxxxxxx> · Thu, 22 Apr 2021 11:23:17 +0200

On Thu, Apr 22, 2021 at 9:13 AM Rasmus Villemoes
<linux@xxxxxxxxxxxxxxxxxx> wrote:
>
> On 22/04/2021 05.32, Andrii Nakryiko wrote:
> > On Wed, Apr 21, 2021 at 6:19 PM Rasmus Villemoes
> > <linux@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >> The comment is wrong. snprintf(buf, 16, "") and snprintf(buf, 16,
> >> "%s", "") etc. will certainly put '\0' in buf[0]. The only case where
> >> snprintf() does not guarantee a nul-terminated string is when it is
> >> given a buffer size of 0 (which of course prevents it from writing
> >> anything at all to the buffer).
> >>
> >> Remove it before it gets cargo-culted elsewhere.
> >>
> >> Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
> >> ---
> >>  kernel/trace/bpf_trace.c | 3 ---
> >>  1 file changed, 3 deletions(-)
> >>
> >
> > The change looks good to me, but please rebase it on top of the
> > bpf-next tree. This is not a bug, so it doesn't have to go into the
> > bpf tree. As it is right now, it doesn't apply cleanly onto bpf-next.

FWIW the idea of the patch also looks good to me :)
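
For reference, the claim in the commit message is easy to check from
user space. This is only a minimal sketch against the libc snprintf(),
assuming the kernel's snprintf() follows the same C99 semantics here (as
the commit message states); first_byte_after() and check_claims() are
hypothetical helper names, not kernel code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pre-fill a 16-byte buffer with 'x', format arg into it with the given
 * size limit (must be <= 16), and report what ended up in buf[0]. */
static char first_byte_after(size_t size, const char *arg)
{
	char buf[16];

	memset(buf, 'x', sizeof(buf));
	snprintf(buf, size, "%s", arg);
	return buf[0];
}

static void check_claims(void)
{
	/* An empty result still gets its terminating '\0' in buf[0]. */
	assert(first_byte_after(16, "") == '\0');
	/* Only size == 0 leaves the buffer untouched. */
	assert(first_byte_after(0, "hello") == 'x');
}
```

So an explicit buf[0] = '\0' after a snprintf() call with a non-zero size
is redundant.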

> Thanks for the pointer. Looking in next-20210420, it seems to me that
>
> commit d9c9e4db186ab4d81f84e6f22b225d333b9424e3
> Author: Florent Revest <revest@xxxxxxxxxxxx>
> Date:   Mon Apr 19 17:52:38 2021 +0200
>
>     bpf: Factorize bpf_trace_printk and bpf_seq_printf
>
> is buggy. In particular, these two snippets:
>
> +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                            \
> +       (mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                         \
> +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)      \
> +         ? (u64)args[arg_nb]                                           \
> +         : (u32)args[arg_nb])
>
>
> +       ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
>
> Regardless of the casts done in that macro, the type of the resulting
> expression is that resulting from C promotion rules. And (foo ? (u64)bla
> : (u32)blib) has type u64, which is thus the type the compiler uses when
> building the vararg list being passed into snprintf(). C simply doesn't
> allow you to change types at run-time in this way.
>
> It probably works fine on x86-64, which passes the first six or so
> arguments in registers, va_start() puts those registers into the va_list
> opaque structure, and when it comes time to do a va_arg(int), just the
> lower 32 bits are used. It is broken on i386 and other architectures
> where arguments are passed on the stack (and for x86-64 as well had
> there been a few more arguments) and va_arg(ap, int) is essentially ({
> int res = *(int *)ap; ap += 4; res; }) [or maybe it's -= 4 because stack
> direction etc., that's not really relevant here].
>
> Rasmus

Thank you Rasmus :)
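
To convince myself of the promotion point, here is a small user-space
sketch. CAST_ARG() only mimics the shape of BPF_CAST_FMT_ARG, and the u64
and u32 typedefs and cast_result_size() are stand-ins I made up for the
example, not the kernel definitions:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;

/* Same shape as BPF_CAST_FMT_ARG: each branch has its own cast, but the
 * conditional expression as a whole has one static type. */
#define CAST_ARG(is_wide, v) ((is_wide) ? (u64)(v) : (u32)(v))

/* The usual arithmetic conversions make the result u64 whichever branch
 * is taken, so the compiler always pushes a 64-bit vararg. */
_Static_assert(sizeof(CAST_ARG(0, 1)) == sizeof(u64), "narrow branch is still u64");
_Static_assert(sizeof(CAST_ARG(1, 1)) == sizeof(u64), "wide branch is u64");

/* sizeof does not evaluate its operand, so a run-time selector cannot
 * change the answer either. */
static size_t cast_result_size(int is_wide)
{
	return sizeof(CAST_ARG(is_wide, 1));
}
```

The (u32) cast still truncates the value, but the truncated value is then
widened back to u64 before it reaches the vararg list.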

It seems that we went off track in
https://lore.kernel.org/bpf/CAEf4BzZVEGM4esi-Rz67_xX_RTDrgxViy0gHfpeauECR5bmRNA@xxxxxxxxxxxxxx/
and we do need something like "88a5c690b6 bpf: fix bpf_trace_printk on
32 bit archs". Thinking about it again, it's clearer now why the
__BPF_TP_EMIT macro emits 2^3=8 different __trace_printk() calls.
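
To illustrate the 2^3 idea: since the type of each vararg has to be fixed
at compile time, every width combination needs its own call site. The
following is a simplified user-space sketch, not the actual kernel macro;
EMIT/ARG1/ARG2/ARG3, format3() and the wide[] selector are hypothetical
names, and it assumes only two widths per argument:

```c
#include <stddef.h>
#include <stdio.h>

/* Innermost step: one concrete snprintf() call site per expansion. */
#define EMIT(...) snprintf(buf, size, fmt, __VA_ARGS__)

/* Each level fixes the static type of one argument and prepends it,
 * doubling the number of call sites: 2 * 2 * 2 = 8 in total. */
#define ARG1(...) (wide[0] ? EMIT(a[0], __VA_ARGS__) \
			   : EMIT((unsigned int)a[0], __VA_ARGS__))
#define ARG2(...) (wide[1] ? ARG1(a[1], __VA_ARGS__) \
			   : ARG1((unsigned int)a[1], __VA_ARGS__))
#define ARG3()    (wide[2] ? ARG2(a[2]) \
			   : ARG2((unsigned int)a[2]))

/* wide[i] != 0 means argument i is consumed as a 64-bit value by fmt. */
static int format3(char *buf, size_t size, const char *fmt,
		   const unsigned long long a[3], const int wide[3])
{
	return ARG3();
}
```

For example, with a = {1, 0x100000002, 3} and wide = {0, 1, 0}, the format
"%u %llu %u" takes the one branch where only the second argument is pushed
as 64 bits. With 12 arguments the same nesting would expand to 4096 call
sites.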

In the case of bpf_trace_printk with a maximum of 3 args, it's
relatively cheap; but for bpf_seq_printf and bpf_snprintf, which accept
up to 12 arguments, that would be 2^12=4096 calls. Until now,
bpf_seq_printf has simply ignored this problem and treated
everything as u64. I wonder if that'd be the best approach for these
two helpers anyway.