> On 27 Mar 2019, at 17:17, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Wed, Mar 27, 2019 at 05:06:01PM +0000, Jiong Wang wrote: >> >>> On 27 Mar 2019, at 17:00, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: >>> >>> On Tue, Mar 26, 2019 at 06:05:29PM +0000, Jiong Wang wrote: >>>> After previous patches, verifier has marked those instructions that really >>>> need zero extension on dst_reg. >>>> >>>> It is then for all back-ends to decide how to use such information to >>>> eliminate unnecessary zero extension codegen during JIT compilation. >>>> >>>> One approach is: >>>> 1. Verifier insert explicit zero extension for those instructions that >>>> need zero extension. >>>> 2. All JIT back-ends do NOT generate zero extension for sub-register >>>> write any more. >>>> >>>> The good thing for this approach is no major change on JIT back-end >>>> interface, all back-ends could get this optimization. >>>> >>>> However, only those back-ends that do not have hardware zero extension >>>> want this optimization. For back-ends like x86_64 and AArch64, there is >>>> hardware support, so this optimization should be disabled. >>>> >>>> This patch introduces new sysctl "bpf_jit_32bit_opt" which is the control >>>> variable for whether the optimization should be enabled. >>>> >>>> It is initialized using target hook bpf_jit_hardware_zext which is default >>>> true, meaning the underlying hardware will do zero extension automatically, >>>> therefore the optimization will be disabled. >>>> >>>> Offload targets do not use this native target hook, instead, they could >>>> get the optimization results using bpf_prog_offload_ops.finalize. >>>> >>>> The user could always enable or disable the optimization by using: >>>> >>>> sysctl net/core/bpf_jit_32bit_opt=[0 | 1] >>> >>> I don't think there should be a sysctl for this. >> >> The sysctl introduced mostly because I think it could be useful for testing. >> For example on x86_64, with this sysctl, we can enable the optimisation and >> can run selftest. >> >> Does this make sense? >> >> Or when one insn is marked, we print verbose info, so the tester could catch >> it from log? > > sysctl in this patch only triggers insertion of shifts. > what kind of testing does it enable on x64? > The writing insn is already 32-bit and hw does zero extend. > These two shifts is always a nop? > a sysctl to test that the verifier inserted shifts in the right place? Yes, that’s the test methodology I am using. Match the instruction sequence after shifts insertion. Regards, Jiong