On 1/23/20 7:06 PM, John Fastabend wrote: > Yonghong Song wrote: >> Commit b7a0d65d80a0 ("bpf, testing: Workaround a verifier failure >> for test_progs") worked around a verifier failure where the >> register is copied to another later refined register, but the >> original register is used after refinement. Another similar example is >> https://lore.kernel.org/netdev/871019a0-71f8-c26d-0ae8-c7fd8c8867fc@xxxxxx/ >> >> LLVM commit https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D72787&d=DwICaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=MoQP7VnsPjDQ8_NbLoNfOhFXJ0BWpjIq29RBOqKCVO0&s=a9JM4PoQiQlvFWOgb7Jn2y-dv1D78nK5gaFe4uLOgDU&e= added a phase > > FWIW I was going to try and see if we could refine the bounds by > walking parents chain. But that is an experiment tbd. My current implementation is limited to use cases I see. So please do ahead to experiment with your use cases if you want to, or let me know I can implement for you. > >> to adjust optimization such that the original register is >> directly refined and used later. Another issue exposed by >> the llvm is verifier cannot handle the following code: >> call bpf_strtoul >> if w0 s< 1 then ... >> if w0 s> 7 then ... >> ... use w0 ... >> >> Unfortunately, the verifier is not able to handle the above >> code well and will reject it. >> call bpf_strtoul >> R0_w=inv(id=0) R8=invP0 >> if w0 s< 0x1 goto pc-22 >> R0_w=inv(id=0) R8=invP0 >> if w0 s> 0x7 goto pc-23 >> R0=inv(id=0) R8=invP0 >> w0 += w8 >> R0_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=invP0 >> >> After "w0 += w8", we got a very conservative R0 value, which >> later on caused verifier rejection. >> >> This patch added two register states, s32_min_value and s32_max_value, >> to bpf_reg_state. These two states capture the signed 32bit >> min/max values refined due to 32bit signed sle/slt/sge/sgt comparisons. >> 1. whenever refined s32_min_value, s32_max_value is captured, reg->var_off >> will be refined if possible. >> 2. For any ALU32 operation where the dst_reg will have upper 32bit cleared, >> if s32_min_value >= 0 and s32_max_value has been narrowed due to previous >> signed compare operation, the dst_reg as an input can ignore upper 32bit values, >> this may produce better output dst_reg value range. > > Can you comment a bit more on the s32_min_value < 0 case? Regardless of the > s32_{min|max}_value the result should be zero extended and smin_value=0. This > is enforced by verifier_zext(), an aside but I think we should just remove > verifier_zext its not very useful if all jits have to comply imo. I am not aware of verifier_zext, need to take a look. The reason I required s32_min_value >= 0 is in coerce_reg_to_size() eventually we have reg->smin_value = reg->umin_value, so smin_value >= 0 here. I am not fully understand why we did this way and thought requiring s32_min_value >= 0 will at least making the eventually smin_value/smax_value correctly. If you can help explain coerce_reg_to_size(), especially why we do reg->smin_value = reg->umin_value; > > If smin_value=0 && s32_max_value>=0 then we should be safe to propagate s32_max_value > into smax_Value as well. > >> 3. s32_min_value and s32_max_value is reset if the corresponding register >> is redefined. >> >> The following shows the new register states for the above example. >> call bpf_strtoul >> R0_w=inv(id=0) R8=invP0 >> if w0 s< 0x1 goto pc-22 >> R0_w=inv(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967, >> s32_min_value=1,var_off=(0x0; 0xffffffff7fffffff)) >> R8=invP0 >> if w0 s> 0x7 goto pc-23 >> R0=inv(id=0,smax_value=9223372032559808519,umax_value=18446744069414584327, >> s32_min_value=1,s32_max_value=7,var_off=(0x0; 0xffffffff00000007)) >> R8=invP0 >> w0 += w8 >> R0_w=inv(id=0,umax_value=7,var_off=(0x0; 0x7)) R8=invP0 > > And we should also have smin_value=0? Right, I need to check smin_value update in function adjust_scalar_min_max_vals(). > >> >> With the above LLVM patch and this commit, the original >> workaround in Commit b7a0d65d80a0 is not needed any more. >> >> Signed-off-by: Yonghong Song <yhs@xxxxxx> >> --- >> include/linux/bpf_verifier.h | 2 + >> kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++----- >> 2 files changed, 65 insertions(+), 10 deletions(-) >> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h >> index 5406e6e96585..d5694308466d 100644 >> --- a/include/linux/bpf_verifier.h >> +++ b/include/linux/bpf_verifier.h >> @@ -123,6 +123,8 @@ struct bpf_reg_state { >> s64 smax_value; /* maximum possible (s64)value */ >> u64 umin_value; /* minimum possible (u64)value */ >> u64 umax_value; /* maximum possible (u64)value */ >> + s32 s32_min_value; /* minimum possible (s32)value */ >> + s32 s32_max_value; /* maximum possible (s32)value */ >> /* parentage chain for liveness checking */ >> struct bpf_reg_state *parent; >> /* Inside the callee two registers can be both PTR_TO_STACK like >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 1cc945daa9c8..c5d6835c38db 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -543,6 +543,14 @@ static void print_verifier_state(struct bpf_verifier_env *env, >> if (reg->umax_value != U64_MAX) >> verbose(env, ",umax_value=%llu", >> (unsigned long long)reg->umax_value); >> + if (reg->s32_min_value != reg->umin_value && >> + reg->s32_min_value != S32_MIN) >> + verbose(env, ",s32_min_value=%d", >> + (int)reg->s32_min_value); >> + if (reg->s32_max_value != reg->umax_value && >> + reg->s32_max_value != S32_MAX) >> + verbose(env, ",s32_max_value=%d", >> + (int)reg->s32_max_value); >> if (!tnum_is_unknown(reg->var_off)) { >> char tn_buf[48]; >> >> @@ -923,6 +931,10 @@ static void __mark_reg_known(struct bpf_reg_state *reg, u64 imm) >> reg->smax_value = (s64)imm; >> reg->umin_value = imm; >> reg->umax_value = imm; >> + >> + /* no need to be precise, just reset s32_{min,max}_value */ >> + reg->s32_min_value = S32_MIN; >> + reg->s32_max_value = S32_MAX; > > If its known it would make more sense to me to set the min/max value vs this > shortcut. Otherwise we wont have bounds to compare against on a JMP32. With JMP32, we will do a coerce_reg_to_size(). For a single constant, we should be okay as long as the value is in 32bit range. But for consistent reasons, I can make s32_{min|max}_value more precise here. > >> } > > Still thinking through the rest, but figured it would be worth kicking the > couple comments above out.. I'm trying to understand if the coerce_reg_size() > can be made a bit cleaner. Thanks for the review! The logic is pretty complex. Indeed coerce_reg_to_size() is one place I did not fully understand how it works. > > Thanks, > John >