On Wed, Dec 20, 2023 at 8:28 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > When I was writing the _eq/lt/... variants I noticed that unless I use > LHS to be a full 8-byte register the compiler can still play > shenanigans even with inline assembly, like choosing a different input > register for LHS than the one already allocated for a variable before > the assembly is emitted, doing left/right shifts to mask upper bits > before the inline assembly logic, and thus since the scalar id > association is broken on that, the cmp does not propagate to what are > logical copies. I saw that llvm can add a redundant r1 = w2 and use r1 as LHS in some cases, but I haven't seen extra shifts. When it's assignment like this then the register link is there and the verifier correctly propagates the range knowledge. Could you share an example where you saw shifts? I also don't understand the point of: _Static_assert(__builtin_constant_p((RHS)), "2nd argument must be a constant expression") in your macro. bpf_assert(bpf_cmp(foo, ==, bar)); seems useful too. Restricting RHS to a constant looks odd. > So maybe in addition to using bpf_cmp, we should instead do a > bpf_assert_op as a replacement that enforces this rule of LHS being 8 > byte (basically just perform __bpf_assert_check) before dispatching to > the bpf_cmp. I don't see the need to restrict LHS to 8 byte. I considered making bpf_cmp macro smarter and use "w" registers when sizeof(lhs)<=4, but I suspect it won't help the verifier much since 32-bit compares update lower 32-bit only and range of upper 32-bit may stay unknown (depending on what was in the code before 'if'). So I kept "r" only. I still need to investigate why barrier_var is better in profiler.c.