On Mon, 2024-12-02 at 09:32 +0100, Kumar Kartikeya Dwivedi wrote:
> Hello,
> For the following program,
>
> 0: R1=ctx() R10=fp0
> ; asm volatile (" \ @ verifier_spill_fill.c:19
> 0: (b7) r1 = 1024                     ; R1_w=1024
> 1: (63) *(u32 *)(r10 -12) = r1        ; R1_w=1024 R10=fp0 fp-16=mmmm????
> 2: (61) r1 = *(u32 *)(r10 -12)        ; R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 fp-16=mmmm????
> 3: (95) exit
> R0 !read_ok
> processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>
> This is a reduced test case from a real-world sched-ext scheduler where
> a 32-byte array was maintained on the stack to store some values, which
> were then used in bounds checking. A known constant was stored in the
> array and later refilled into a reg to perform a bounds check, similar
> to the example above.
>
> Like in the example, the verifier loses precision for the value r1,
> i.e. when it is loaded back from the 4-byte aligned stack slot, the
> precise value is lost.
> For the actual program, this meant that the bounds check produced an
> error, as after the fill of the u32 from the u32[N] array, the
> verifier didn't see the exact value.
>
> I understand why the verifier has to behave this way, since each
> spilled bpf_reg_state maps to one stack slot, and the stack slot maps
> to an 8-byte region.
> My question is whether this is something that people are interested in
> improving longer term, or is it better to suggest people to work
> around such cases?

I'd start by trying to measure how much precision we leave on the table.
E.g. modify the fixed-offset stack read/write routines to count the
number of aligned u8/u16/u32 reads/writes, expose this information in
verifier statistics, aggregate it with veristat, and then run on the
selftests. Of course, the test cases are tuned to be verifiable, so the
signal would be biased, but still, u32 is pretty common in the source
tree.
As Alexei says in the sibling thread, there are multiple ways to
integrate u32 spill/fill tracking, e.g.:
- treat bpf_stack_state->spilled_ptr as a union of one 64-bit register
  or two 32-bit registers;
- treat u32 writes as 64-bit writes that don't change the tnum
  representation of the other spilled_ptr half (but invalidate the
  range information).

The former seems to be the cleaner approach, but would need more work
(and might benefit from [0] to save some space). The latter seems
easier to implement, but might frustrate end users by having different
tracking power for u32 and u64 values.
Do you have a specific implementation in mind?

[0] https://lore.kernel.org/bpf/ZTZxoDJJbX9mrQ9w@u94a/