On Thu, May 09, 2019 at 01:32:30PM +0100, Jiong Wang wrote: > > Alexei Starovoitov writes: > > > On Wed, May 08, 2019 at 03:45:12PM +0100, Jiong Wang wrote: > >> > >> I might be misunderstanding your points, please just shout if I am wrong. > >> > >> Suppose the following BPF code: > >> > >> unsigned helper(unsigned long long, unsigned long long); > >> unsigned long long test(unsigned *a, unsigned int c) > >> { > >> unsigned int b = *a; > >> c += 10; > >> return helper(b, c); > >> } > >> > >> We get the following instruction sequence by latest llvm > >> (-O2 -mattr=+alu32 -mcpu=v3) > >> > >> test: > >> 1: w1 = *(u32 *)(r1 + 0) > >> 2: w2 += 10 > >> 3: call helper > >> 4: exit > >> > >> Argument Types > >> === > >> Now instruction 1 and 2 are sub-register defines, and instruction 3, the > >> call, use them implicitly. > >> > >> Without the introduction of the new ARG_CONST_SIZE32 and > >> ARG_CONST_SIZE32_OR_ZERO, we don't know what should be done with w1 and > >> w2, zero-extend them should be fine for all cases, but could resulting in a > >> few unnecessary zero-extension inserted. > > > > I don't think we're on the same page. > > The argument type is _const_. > > In the example above they are not _const_. > > Right, have read check_func_arg + check_helper_mem_access again. > > Looks like ARG_CONST_SIZE* are designed for describing memory access size > for things like bounds checking. It must be a constant for stack access, > otherwise prog will be rejected, but it looks to me variables are allowed > for pkt/map access. > > But pkt/map has extra range info. So, indeed, ARG_CONST_SIZE32* are > unnecessary, the width could be figured out through the range. > > Will just drop this patch in next version. > > And sorry for repeating it again, I am still concerned on the issue > described at https://www.spinics.net/lists/netdev/msg568678.html. > > To be simple, zext insertion is based on eBPF ISA and assumes all > sub-register defines from alu32 or narrow loads need it if the underlying It's not an assumption. It's a requirement. If JIT is not zeroing upper 32-bits after 32-bit alu or narrow load it's a bug. > hardware arches don't do it. However, some arches support hardware zext > partially. For example, PowerPC, SPARC etc are 64-bit arches, while they > don't do hardware zext on alu32, they do it for narrow loads. And RISCV is > even more special, some alu32 has hardware zext, some don't. > > At the moment we have single backend hook "bpf_jit_hardware_zext", once a > backend enable it, verifier just insert zero extension for all identified > alu32 and narrow loads. > > Given verifier analysis info is not pushed down to JIT back-ends, verifier > needs more back-end info pushed up from back-ends. Do you think make sense > to introduce another hook "bpf_jit_hardware_zext_narrow_load" to at least > prevent unnecessary zext inserted for narrowed loads for arches like > PowerPC, SPARC? > > The hooks to control verifier zext insertion then becomes two: > > bpf_jit_hardware_zext_alu32 > bpf_jit_hardware_zext_narrow_load and what to do with other combinations? Like in some cases narrow load on particular arch will be zero extended by hw and if it's misaligned or some other condition then it will not be? It doesn't feel that we can enumerate all such combinations. It feels 'bpf_jit_hardware_zext' backend hook isn't quite working. It optimizes out some zext, but may be adding unnecessary extra zexts. May be it should be a global flag from the verifier unidirectional to JITs that will say "the verifier inserted MOV32 where necessary. JIT doesn't need to do zext manually". And then JITs will remove MOV32 when hw does it automatically. Removal should be easy, since such insn will be right after corresponding alu32 or narrow load.