On Thu, 2024-02-08 at 20:05 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> LLVM generates bpf_cast_kern and bpf_cast_user instructions while translating
> pointers with __attribute__((address_space(1))).
>
> rX = cast_kern(rY) is processed by the verifier and converted to
> normal 32-bit move: wX = wY
>
> bpf_cast_user has to be converted by JIT.
>
> rX = cast_user(rY) is
>
> aux_reg = upper_32_bits of arena->user_vm_start
> aux_reg <<= 32
> wX = wY // clear upper 32 bits of dst register
> if (wX) // if not zero add upper bits of user_vm_start
>   wX |= aux_reg
>
> JIT can do it more efficiently:
>
> mov dst_reg32, src_reg32  // 32-bit move
> shl dst_reg, 32
> or dst_reg, user_vm_start
> rol dst_reg, 32
> xor r11, r11
> test dst_reg32, dst_reg32 // check if lower 32-bit are zero
> cmove r11, dst_reg        // if so, set dst_reg to zero
>                           // Intel swapped src/dst register encoding in CMOVcc
>
> Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>

Checked generated x86 code for all reg combinations, works as expected.

Acked-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
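
For readers who want to convince themselves that the two sequences in the quoted commit message agree, here is a minimal user-space sketch in C that models both and compares them. It is not kernel or JIT code. The function names (cast_user_ref, cast_user_jit, self_check) and the sample user_vm_start value are hypothetical. Two readings are assumptions: the final OR in the reference sequence is treated as a 64-bit operation (a 32-bit OR could not set the upper bits), and the immediate of the `or` instruction is taken to be the upper 32 bits of user_vm_start.

```c
#include <assert.h>
#include <stdint.h>

/* Reference sequence from the commit message.
 * Assumption: the last OR acts on the full 64-bit register. */
static uint64_t cast_user_ref(uint64_t src, uint64_t user_vm_start)
{
    uint64_t aux = (user_vm_start >> 32) << 32; /* upper 32 bits, shifted back up */
    uint64_t dst = (uint32_t)src;               /* wX = wY: clears upper 32 bits */

    if (dst)                                    /* NULL stays NULL */
        dst |= aux;
    return dst;
}

/* Rotate left; only called with n == 32 here, so no shift-by-64 issue. */
static uint64_t rol64(uint64_t v, unsigned int n)
{
    return (v << n) | (v >> (64 - n));
}

/* Model of the x86 sequence, one statement per instruction.
 * Assumption: the OR immediate is the upper 32 bits of user_vm_start. */
static uint64_t cast_user_jit(uint64_t src, uint64_t user_vm_start)
{
    uint64_t dst = (uint32_t)src;  /* mov dst_reg32, src_reg32 */
    uint64_t r11;

    dst <<= 32;                    /* shl dst_reg, 32 */
    dst |= user_vm_start >> 32;    /* or dst_reg, imm32 */
    dst = rol64(dst, 32);          /* rol dst_reg, 32 */
    r11 = 0;                       /* xor r11, r11 */
    if ((uint32_t)dst == 0)        /* test dst_reg32, dst_reg32 */
        dst = r11;                 /* cmove: zero dst if low 32 bits are zero */
    return dst;
}

/* Compare both models on a few sample source values. */
static void self_check(uint64_t user_vm_start)
{
    static const uint64_t samples[] = {
        0, 1, 0x1234, 0xffffffffULL,
        0xdeadbeef00000000ULL, 0xdeadbeef00000010ULL,
    };
    unsigned int i;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
        assert(cast_user_ref(samples[i], user_vm_start) ==
               cast_user_jit(samples[i], user_vm_start));
}
```

Calling self_check(0x00007f0000000000ULL) from a test program exercises the zero and non-zero paths, including source values whose upper 32 bits hold garbage. The sketch only models the arithmetic. It does not cover register encoding, which is what the per-register check above addresses.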