Re: Improve precision loss when doing <8-bytes spill to stack slot?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 2, 2024 at 12:32 AM Kumar Kartikeya Dwivedi
<memxor@xxxxxxxxx> wrote:
>
> Hello,
> For the following program,
>
> 0: R1=ctx() R10=fp0
> ; asm volatile ("                                       \ @
> verifier_spill_fill.c:19
> 0: (b7) r1 = 1024                     ; R1_w=1024
> 1: (63) *(u32 *)(r10 -12) = r1        ; R1_w=1024 R10=fp0 fp-16=mmmm????
> 2: (61) r1 = *(u32 *)(r10 -12)        ;
> R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> R10=fp0 fp-16=mmmm????
> 3: (95) exit
> R0 !read_ok
> processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0
> peak_states 0 mark_read 0
>
> This is a reduced test case from a real world sched-ext scheduler when
> a 32-byte array was maintained on the stack to store some values,
> whose values were then used in bounds checking. A known constant was
> stored in the array and later refilled into a reg to perform a bounds
> check, similar to the example above.
>
> Like in the example, the verifier loses precision for the value r1,
> i.e. when it is loaded back from the 4-byte aligned stack slot, the
> precise value is lost.
> For the actual program, this meant that bounds check produced an
> error, as after the fill of the u32 from the u32[N] array, the
> verifier didn't see the exact value.
>
> I understand why the verifier has to behave this way, since each
> spilled bpf_reg_state maps to one stack slot, and the stack slot maps
> to an 8-byte region.
> My question is whether this is something that people are interested in
> improving longer term, or is it better to suggest people to workaround
> such cases?

I think we need to consider improving the verifier if we can
come up with a reasonable path to implement it.

The stack_state memory consumption can be mitigated as:
 struct bpf_stack_state {
        struct bpf_reg_state spilled_ptr;
+       struct bpf_reg_state *spilled_ptr_off4;
        u8 slot_type[BPF_REG_SIZE];
 };

but I suspect plenty of code would need to be touched to support
such spill/fill.

And then the next question is whether support for u16 is needed too.

Without seeing an actual code it's hard to judge whether full bpf_reg_state
is needed with tnum and ranges.
It may be the case that only constants are needed to be tracked.
Then we can approach it from the angle of generalizing STACK_ZERO.
Like replace STACK_ZERO with STACK_CONST_VAL and let every byte
remember the actual value written.
 0: (b7) r1 = 1024                     ; R1_w=1024
 1: (63) *(u32 *)(r10 -12) = r1        ; R1_w=1024 R10=fp0 fp-16=mmmm????

will populate four constants 00 00 04 00

and *(u32 *)(r10 -12) will read it back as 1024.

Pretty much what mark_reg_stack_read() is doing today
with a special case of zero, but for all u8 consts.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux