On Thu, 2024-04-11 at 20:27 +0800, Xu Kuohai wrote: > From: Xu Kuohai <xukuohai@xxxxxxxxxx> > > With lsm return value check, the no-alu32 version test_libbpf_get_fd_by_id_opts > is rejected by the verifier, and the log says: > > 0: R1=ctx() R10=fp0 > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 0: (b7) r0 = 0 ; R0_w=0 > 1: (79) r2 = *(u64 *)(r1 +0) > func 'bpf_lsm_bpf_map' arg0 has btf_id 916 type STRUCT 'bpf_map' > 2: R1=ctx() R2_w=trusted_ptr_bpf_map() > ; if (map != (struct bpf_map *)&data_input) @ test_libbpf_get_fd_by_id_opts.c:29 > 2: (18) r3 = 0xffff9742c0951a00 ; R3_w=map_ptr(map=data_input,ks=4,vs=4) > 4: (5d) if r2 != r3 goto pc+4 ; R2_w=trusted_ptr_bpf_map() R3_w=map_ptr(map=data_input,ks=4,vs=4) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 5: (79) r0 = *(u64 *)(r1 +8) ; R0_w=scalar() R1=ctx() > ; if (fmode & FMODE_WRITE) @ test_libbpf_get_fd_by_id_opts.c:32 > 6: (67) r0 <<= 62 ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000)) > 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0) > ; @ test_libbpf_get_fd_by_id_opts.c:0 > 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3)) > ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 > 9: (95) exit > > And here is the C code of the prog. > > SEC("lsm/bpf_map") > int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) > { > if (map != (struct bpf_map *)&data_input) > return 0; > > if (fmode & FMODE_WRITE) > return -EACCES; > > return 0; > } > > It is clear that the prog can only return either 0 or -EACCESS, and both > values are legal. > > So why is it rejected by the verifier? > > The verifier log shows that the second if and return value setting > statements in the prog is optimized to bitwise operations "r0 s>>= 63" > and "r0 &= -13". The verifier correctly deduces that the the value of > r0 is in the range [-1, 0] after verifing instruction "r0 s>>= 63". > But when the verifier proceeds to verify instruction "r0 &= -13", it > fails to deduce the correct value range of r0. > > 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0) > 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3)) > > So why the verifier fails to deduce the result of 'r0 &= -13'? > > The verifier uses tnum to track values, and the two ranges "[-1, 0]" and > "[0, -1ULL]" are encoded to the same tnum. When verifing instruction > "r0 &= -13", the verifier erroneously deduces the result from > "[0, -1ULL] AND -13", which is out of the expected return range > [-4095, 0]. > > To fix it, this patch simply adds a special SCALAR32 case for the > verifier. That is, when the source operand of the AND instruction is > a constant and the destination operand changes from negative to > non-negative and falls in range [-256, 256], deduce the result range > by enumerating all possible AND results. > > Signed-off-by: Xu Kuohai <xukuohai@xxxxxxxxxx> > --- Hello, Sorry for the delay, I had to think about this issue a bit. I found the clang transformation that generates the pattern this patch tries to handle. It is located in DAGCombiner::SimplifySelectCC() method (see [1]). The transformation happens as a part of DAG to DAG rewrites (LLVM uses several internal representations: - generic optimizer uses LLVM IR, most of the work is done using this representation; - before instruction selection IR is converted to Selection DAG, some optimizations are applied at this stage, all such optimizations are a set of pattern replacements; - Selection DAG is converted to machine code, some optimizations are applied at the machine code level). Full pattern is described as follows: // fold (select_cc seteq (and x, y), 0, 0, A) -> (and (sra (shl x)) A) // where y is has a single bit set. // A plaintext description would be, we can turn the SELECT_CC into an AND // when the condition can be materialized as an all-ones register. Any // single bit-test can be materialized as an all-ones register with // shift-left and shift-right-arith. For this particular test case the DAG is converted as follows: .---------------- lhs The meaning of this select_cc is: | .------- rhs `lhs == rhs ? true value : false value` | | .----- true value | | | .-- false value v v v v (select_cc seteq (and X 2) 0 0 -13) ^ -> '---------------. (and (sra (sll X 62) 63) | -13) | | Before pattern is applied, it checks that second 'and' operand has only one bit set, (which is true for '2'). The pattern itself generates logical shift left / arithmetic shift right pair, that ensures that result is either all ones (-1) or all zeros (0). Hence, applying 'and' to shifts result and false value generates a correct result. In my opinion the approach taken by this patch is sub-optimal: - 512 iterations is too much; - this does not cover all code that could be generated by the above mentioned LLVM transformation (e.g. second 'and' operand could be 1 << 16). Instead, I suggest to make a special case for source or dst register of '&=' operation being in range [-1,0]. Meaning that one of the '&=' operands is either: - all ones, in which case the counterpart is the result of the operation; - all zeros, in which case zero is the result of the operation; - derive MIN and MAX values based on above two observations. [1] https://github.com/llvm/llvm-project/blob/4523a267829c807f3fc8fab8e5e9613985a51565/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L5391 Best regards, Eduard