On Fri, Jun 14, 2019 at 10:06 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Fri, Jun 14, 2019 at 8:13 AM Jiong Wang <jiong.wang@xxxxxxxxxxxxx> wrote: > > > > > > Alexei Starovoitov writes: > > > > > On Wed, Jun 12, 2019 at 8:25 AM Jiong Wang <jiong.wang@xxxxxxxxxxxxx> wrote: > > >> > > >> > > >> Jiong Wang writes: > > >> > > >> > Alexei Starovoitov writes: > > >> > > > >> >> On Wed, Jun 12, 2019 at 4:32 AM Naveen N. Rao > > >> >> <naveen.n.rao@xxxxxxxxxxxxxxxxxx> wrote: > > >> >>> > > >> >>> Currently, for constant blinding, we re-allocate the bpf program to > > >> >>> account for its new size and adjust all branches to accommodate the > > >> >>> same, for each BPF instruction that needs constant blinding. This is > > >> >>> inefficient and can lead to soft lockup with sufficiently large > > >> >>> programs, such as the new verifier scalability test (ld_dw: xor > > >> >>> semi-random 64 bit imms, test 5 -- with net.core.bpf_jit_harden=2) > > >> >> > > >> >> Slowdown you see is due to patch_insn right? > > >> >> In such case I prefer to fix the scaling issue of patch_insn instead. > > >> >> This specific fix for blinding only is not addressing the core of the problem. > > >> >> Jiong, > > >> >> how is the progress on fixing patch_insn? > > >> > > >> And what I have done is I have digested your conversion with Edward, and is > > >> slightly incline to the BB based approach as it also exposes the inserted > > >> insn to later pass in a natural way, then was trying to find a way that > > >> could create BB info in little extra code based on current verifier code, > > >> for example as a side effect of check_subprogs which is doing two insn > > >> traversal already. (I had some such code before in the historical > > >> wip/bpf-loop-detection branch, but feel it might be still too heavy for > > >> just improving insn patching) > > > > > > BB - basic block? > > > I'm not sure that was necessary. > > > The idea was that patching is adding stuff to linked list instead > > > and single pass at the end to linearize it. > > > > Just an update and keep people posted. > > > > Working on linked list based approach, the implementation looks like the > > following, mostly a combine of discussions happened and Naveen's patch, > > please feel free to comment. > > > > - Use the reserved opcode 0xf0 with BPF_ALU as new pseudo insn code > > BPF_LIST_INSN. (0xf0 is also used with BPF_JMP class for tail call). > > > > - Introduce patch pool into bpf_prog->aux to keep all patched insns. > > Pool structure looks like: > > > > struct { > > int num; > > int prev; > > int next; > > } head_0; > > NUM patched insns for head_0 > > head_1; > > patched insns for head_1 > > head_2; > > ... > > > > - Now when doing bpf_patch_insn_single, it doesn't change the original > > prog etc, instead, it merely update the insn at patched offset into a > > BPF_LIST_INSN, and pushed the patched insns plus a patch header into > > the patch pool. Fields of BPF_LIST_INSN is updated to setup the links: > > > > BPF_LIST_INSN.off += patched_size > > (accumulating the size attached to this list_insn, it is possible a > > later patch pass patches insn in the patch pool, this means insn > > traversal needs to be changed, when seeing BPF_LIST_INSN, should go > > through the list) > > > > BPF_LIST_INSN.imm = offset of the patch header in patch pool > > (off is 16-bit, imm is 32-bit, the patch pool is 32-bit length, so > > use imm for keeping offset, meaning a BPF_LIST_INSN can contains no > > more than 8192 insns, guess it is enough) > > > > - When doing linearize: > > 1. a quick scan of prog->insnsi to know the final > > image size, would be simple as: > > > > fini_size = 0; > > for_each_insn: > > if (insn.code == (BPF_ALU | BPF_LIST_HEAD)) > > fini_size += insn->off; > > else > > fini_size++; > > > > 2. Resize prog into fini_size, and a second scan of prog->insnsi to > > copy over all insns and patched insns, at the same time generate a > > auxiliary index array which maps an old index to the new index in > > final image, like the "clone_index" in Naveen's patch. > > > > 3. Finally, a single pass to update branch target, the same algo used > > by this patch. > > > > - The APIs for doing insning patch looks like: > > bpf_patch_insn_init: init the generic patch pool. > > bpf_patch_insn_single: push patched insns to the pool. > > link them to the associated BPF_LIST_INSN. > > bpf_patch_insn_fini: linearize a bpf_prog contains BPF_LIST_INSN. > > destroy patch pool in prog->aux. > > > > I am trying to making the implementation working with jit blind first to make > > sure basic things are ready. As JIT blinds happens after verification so no > > need to both aux update etc. Then will cleanup quite a few things for > > example patch a patched insn, adjust aux data, what to do with insn delete > > etc. > > explicit indices feels like premature optimization. > May be use vanilla singly linked list instead? > Also do we have a case when patched insn will be patched again? > In such case 'patch insn pool' will become recursive? > Feels difficult to think through all offsets and indices. > Whereas with linked list patching patched insns will be inserting > them into link list. > > May be better alternative is to convert the whole program to link list > of insns with branch targets becoming pointers and insert patched > insns into this single singly linked list ? I think converting to a singly-linked list is a good, simple and straightforward approach. The only downside is 3x more memory (insn + next pointer + branch pointer for instructions that branch/call), but it shouldn't be a big deal in practice. But it seems like a good opportunity to simplify and clean up patching code, so I'd definitely vote to start with this approach.