On 13/09/2023 at 02:22, Puranjay Mohan wrote:
> On Wed, Sep 13, 2023 at 2:09 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
>>
>> On Tue, Sep 12, 2023 at 3:49 PM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
>>>
>>> Hi Alexei,
>>>
>>> [...]
>>>
>>>> I guess we never clearly defined what 'needs_zext' is supposed to be,
>>>> so it wouldn't be fair to call 32-bit JITs buggy.
>>>> But we better address this issue now.
>>>> This 32-bit zeroing after LDX hurts mips64, s390, ppc64, riscv64.
>>>> I believe all 4 JITs emit proper zero extension into 64-bit register
>>>> by using single cpu instruction,
>>>> but they also define bpf_jit_needs_zext() as true,
>>>> so extra BPF_ZEXT_REG() is added by the verifier
>>>> and it is a pure run-time overhead.
>>>
>>> I just realised that these zext instructions will not be a runtime
>>> overhead because the JITs ignore them.
>>> Like s390 does:
>>>     case BPF_LDX | BPF_MEM | BPF_B: /* dst = *(u8 *)(ul) (src + off) */
>>>     case BPF_LDX | BPF_PROBE_MEM | BPF_B:
>>>             /* llgc %dst,0(off,%src) */
>>>             EMIT6_DISP_LH(0xe3000000, 0x0090, dst_reg, src_reg, REG_0, off);
>>>             jit->seen |= SEEN_MEM;
>>>             if (insn_is_zext(&insn[1]))
>>>                     insn_count = 2; /* this will skip the next zext instruction */
>>>             break;
>>>
>>> powerpc does after LDX:
>>>     if (size != BPF_DW && insn_is_zext(&insn[i + 1]))
>>>             addrs[++i] = ctx->idx * 4;
>>
>>
>> I see. Indeed the 64-bit JITs ignore this special zext insn after LDX.
>>
>>>> It's better to remove
>>>>     if (t != SRC_OP)
>>>>             return BPF_SIZE(code) == BPF_DW;
>>>> from is_reg64() to avoid adding BPF_ZEXT_REG() insn
>>>> and fix 32-bit JITs at the same time.
>>>> RISCV32, PowerPC32, x86-32 JITs fixed in the first 3 patches
>>>> to always zero upper 32-bit after LDX and
>>>> then 4th patch to remove these two lines.
>>>
>>> I have sent the patches for above, although I think this optimization
>>> is useful because zero extension after LDX is only required when the
>>> loaded value is later being used as a 64-bit value. If it is not the
>>> case then the verifier will not emit the zext and 32-bit JITs will emit
>>> 1 less instruction because they expect the verifier to do the zext for
>>> them where required.
>>
>> You're correct.
>> Ok. Let's keep zext for LDX as-is.
>
> Yes,
> let's do
>  if (class == BPF_LDX) {
>          if (t != SRC_OP)
> -                return BPF_SIZE(code) == BPF_DW;
> +                return (BPF_SIZE(code) == BPF_DW ||
>                          BPF_MODE(code) == BPF_MEMSX);

You don't need the parentheses, just do

	return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;

Christophe
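
For anyone following along, here is a minimal standalone sketch of the destination-register check being discussed for LDX. It is not the verifier's is_reg64(); the helper name ldx_dst_is_reg64() is hypothetical, and the opcode field values are restated locally as I understand them from the BPF uapi header, so please check them against your tree. The idea it illustrates: a BPF_DW load, or a sign-extending BPF_MEMSX load of any size, defines the full 64-bit register, so no zext needs to follow it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Opcode field encodings, restated so this sketch compiles on its own.
 * Assumption: these match include/uapi/linux/bpf{,_common}.h.
 */
#define BPF_LDX		0x01	/* instruction class: load into register */

#define BPF_SIZE(code)	((code) & 0x18)
#define BPF_W		0x00	/* 32-bit */
#define BPF_H		0x08	/* 16-bit */
#define BPF_B		0x10	/*  8-bit */
#define BPF_DW		0x18	/* 64-bit */

#define BPF_MODE(code)	((code) & 0xe0)
#define BPF_MEM		0x60	/* plain load */
#define BPF_MEMSX	0x80	/* load with sign extension */

/*
 * Hypothetical helper: does an LDX with this opcode write all 64 bits
 * of its destination register?  True for 64-bit loads and for
 * sign-extending loads; false for plain 8/16/32-bit loads, which are
 * the ones that may still need an explicit zext on 32-bit JITs.
 */
static bool ldx_dst_is_reg64(uint8_t code)
{
	return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;
}
```

With this, ldx_dst_is_reg64(BPF_LDX | BPF_MEM | BPF_W) is false, while the BPF_MEMSX variant of the same 32-bit load is true.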