On Wed, Mar 27, 2019 at 05:06:01PM +0000, Jiong Wang wrote: > > > On 27 Mar 2019, at 17:00, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > > > On Tue, Mar 26, 2019 at 06:05:29PM +0000, Jiong Wang wrote: > >> After previous patches, verifier has marked those instructions that really > >> need zero extension on dst_reg. > >> > >> It is then for all back-ends to decide how to use such information to > >> eliminate unnecessary zero extension codegen during JIT compilation. > >> > >> One approach is: > >> 1. Verifier insert explicit zero extension for those instructions that > >> need zero extension. > >> 2. All JIT back-ends do NOT generate zero extension for sub-register > >> write any more. > >> > >> The good thing for this approach is no major change on JIT back-end > >> interface, all back-ends could get this optimization. > >> > >> However, only those back-ends that do not have hardware zero extension > >> want this optimization. For back-ends like x86_64 and AArch64, there is > >> hardware support, so this optimization should be disabled. > >> > >> This patch introduces new sysctl "bpf_jit_32bit_opt" which is the control > >> variable for whether the optimization should be enabled. > >> > >> It is initialized using target hook bpf_jit_hardware_zext which is default > >> true, meaning the underlying hardware will do zero extension automatically, > >> therefore the optimization will be disabled. > >> > >> Offload targets do not use this native target hook, instead, they could > >> get the optimization results using bpf_prog_offload_ops.finalize. > >> > >> The user could always enable or disable the optimization by using: > >> > >> sysctl net/core/bpf_jit_32bit_opt=[0 | 1] > > > > I don't think there should be a sysctl for this. > > The sysctl introduced mostly because I think it could be useful for testing. > For example on x86_64, with this sysctl, we can enable the optimisation and > can run selftest. > > Does this make sense? > > Or when one insn is marked, we print verbose info, so the tester could catch > it from log? sysctl in this patch only triggers insertion of shifts. what kind of testing does it enable on x64? The writing insn is already 32-bit and hw does zero extend. These two shifts is always a nop? a sysctl to test that the verifier inserted shifts in the right place?