On Tue, Jul 16, 2024 at 11:15 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Tue, 2024-07-16 at 03:03 -0700, Eduard Zingerman wrote: > > On Mon, 2024-07-15 at 19:00 -0700, Alexei Starovoitov wrote: > > > On Mon, Jul 15, 2024 at 4:02 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > > > [...] > > > > > > This might lead to a surprising behavior in combination with nocsr > > > > rewrites, e.g. consider the program below: > > > > > > > > 1: r1 = 1; > > > > /* nocsr pattern with stack offset -16 */ > > > > 2: *(u64 *)(r10 - 16) = r1; > > > > 3: call %[bpf_get_smp_processor_id]; > > > > 4: r1 = *(u64 *)(r10 - 16); > > > > 5: r1 = r10; > > > > 6: r1 += -8; > > > > 7: r2 = 1; > > > > 8: r3 = r10; > > > > 9: r3 += -16; > > > > /* bpf_probe_read_kernel(dst: &fp[-8], size: 1, src: &fp[-16]) */ > > > > 10: call %[bpf_probe_read_kernel]; > > > > 11: exit; > > > > > > > > Here nocsr rewrite logic would remove instructions (2) and (4). > > > > However, (2) writes a value that is later read by a call at (10). > > > > > > This makes no sense to me. > > > This bpf prog is broken. > > > If probe_read is used to read stack it will read garbage. > > > JITs and the verifier are allowed to do any transformation > > > that keeps the program semantics and safety. > > Ok, my bad, the following program works at the moment: > > SEC("socket") // <---- used wrong program type > __retval(42) > __success > int bpf_probe_read_kernel_stack_ptr(void *ctx) > { > unsigned long a = 17; > unsigned long b = 42; > int err; > > err = bpf_probe_read_kernel(&a, 8, &b); > if (err) > return 1; > return a; > } > > And it is compiled to BPF as one would expect: > > ... fp[-8,-16] setup ... > 4: r1 = r10 > 5: r1 += -0x8 > 6: r3 = r10 > 7: r3 += -0x10 > 8: w2 = 0x8 > 9: call 0x71 > ... return check ... > > So, the point stands: from C compiler pov pointer &b escapes, > and compiler is not really allowed to replace object at that offset > with garbage. Why do you think the program is broken? This is apples to oranges. Compiler sees that the address of 'b' is taken and passed into a function with side effect. Whether 3rd arg of bpf_probe_read_kernel() is void * or long is irrelevant. Compilers will do it, because it's a C language requirement. > I don't mind dropping the patch in question, but I agree with Andrii's > viewpoint that there is nothing wrong with this use case. bpf_probe_read_kernel() is not special and it's 3rd argument is some kernel address. Whether it's stack pointer or anything else is irrelevant. JITs and verifier are allowed to do any optimizations on stack and any other memory completely ignoring presence of bpf_probe_read_kernel() and what is being passed into it. Tomorrow we will teach arm64 JIT to replace stack spill/fill with spare register read/write. There is no way we're going to special case a particular fp-16 slot because fp-16 was passed into probe_read.