[RFC bpf-next 0/8] bpf: accelerate insn patching speed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is an RFC based on latest bpf-next about acclerating insn patching
speed, it is now near the shape of final PATCH set, and we could see the
changes migrating to list patching would brings, so send out for
comments. Most of the info are in cover letter. I splitted the code in a
way to show API migration more easily.

Test Results
===
  - Full pass on test_verifier/test_prog/test_prog_32 under all three
    modes (interpreter, JIT, JIT with blinding).

  - Benchmarking shows 10 ~ 15x faster on medium sized prog, and reduce
    patching time from 5100s (nearly one and a half hour) to less than
    0.5s for 1M insn patching.

Known Issues
===
  - The following warning is triggered when running scale test which
    contains 1M insns and patching:
      warning of mm/page_alloc.c:4639 __alloc_pages_nodemask+0x29e/0x330

    This is caused by existing code, it can be reproduced on bpf-next
    master with jit blinding enabled, then run scale unit test, it will
    shown up after half an hour. After this set, patching is very fast, so
    it shows up quickly.

  - No line info adjustment support when doing insn delete, subprog adj
    is with bug when doing insn delete as well. Generally, removal of insns
    could possibly cause remove of entire line or subprog, therefore
    entries of prog->aux->linfo or env->subprog needs to be deleted. I
    don't have good idea and clean code for integrating this into the
    linearization code at the moment, will do more experimenting,
    appreciate ideas and suggestions on this.
     
    Insn delete doesn't happen on normal programs, for example Cilium
    benchmarks, and happens rarely on test_progs, so the test coverage is
    not good. That's also why this RFC have a full pass on selftest with
    this known issue.

  - Could further use mem pool to accelerate the speed, changes are trivial
    on top of this RFC, and could be 2x extra faster. Not included in this
    RFC as reducing the algo complexity from quadratic to linear of insn
    number is the first step.

Background
===
This RFC aims to accelerate BPF insn patching speed, patching means expand
one bpf insn at any offset inside bpf prog into a set of new insns, or
remove insns.

At the moment, insn patching is quadratic of insn number, this is due to
branch targets of jump insns needs to be adjusted, and the algo used is:

  for insn inside prog
    patch insn + regeneate bpf prog
    for insn inside new prog
      adjust jump target

This is causing significant time spending when a bpf prog requires large
amount of patching on different insns. Benchmarking shows it could take
more than half minutes to finish patching when patching number is more
than 50K, and the time spent could be more than one hour when patching
number is around 1M.

  15000   :    3s
  45000   :   29s
  95000   :  125s
  195000  :  712s
  1000000 : 5100s

This RFC introduces new patching infrastructure. Before doing insn
patching, insns in bpf prog are turned into a singly linked list, insert
new insns just insert new list node, delete insns just set delete flag.
And finally, the list is linearized back into array, and branch target
adjustment is done for all jump insns during linearization. This algo
brings the time complexity from quadratic to linear of insn number.

Benchmarking shows the new patching infrastructure could be 10 ~ 15x faster
on medium sized prog, and for a 1M patching it reduce the time from 5100s
to less than 0.5s.

Patching API
===
Insn patching could happen on two layers inside BPF. One is "core layer"
where only BPF insns are patched. The other is "verification layer" where
insns have corresponding aux info as well high level subprog info, so
insn patching means aux info needs to be patched as well, and subprog info
needs to be adjusted. BPF prog also has debug info associated, so line info
should always be updated after insn patching.

So, list creation, destroy, insert, delete is the same for both layer,
but lineration is different. "verification layer" patching require extra
work. Therefore the patch APIs are:

   list creation:                bpf_create_list_insn
   list patch:                   bpf_patch_list_insn
   list pre-patch:               bpf_prepatch_list_insn
   list lineration (core layer): prog = bpf_linearize_list_insn(prog, list)
   list lineration (veri layer): env = verifier_linearize_list_insn(env, list)
   list destroy:                 bpf_destroy_list_insn

list patch could change the insn at patch point, it will invalid the aux
info at patching point. list pre-patch insert new insns before patch point
where the insn and associated aux info are not touched, it is used for
example in convert_ctx_access when generating prologue. 

Typical API sequence for one patching pass:

   struct bpf_list_insn list = bpf_create_list_insn(struct bpf_prog);
   for (elem = list; elem; elem = elem->next)
      patch_buf = gen_patch_buf_logic;
      elem = bpf_patch_list_insn(elem, patch_buf, cnt);
   bpf_prog = bpf_linearize_list_insn(list)
   bpf_destroy_list_insn(list)
  
Several patching passes could also share the same list:

   struct bpf_list_insn list = bpf_create_list_insn(struct bpf_prog);
   for (elem = list; elem; elem = elem->next)
      patch_buf = gen_patch_buf_logic1;
      elem = bpf_patch_list_insn(elem, patch_buf, cnt);
   for (elem = list; elem; elem = elem->next)
      patch_buf = gen_patch_buf_logic2;
      elem = bpf_patch_list_insn(elem, patch_buf, cnt);
   bpf_prog = bpf_linearize_list_insn(list)
   bpf_destroy_list_insn(list)

but note new inserted insns int early passes won't have aux info except
zext info. So, if one patch pass requires all aux info updated and
recalculated for all insns including those pathced, it should first
linearize the old list, then re-create the list. The RFC always create and
linearize the list for each migrated patching pass separately.

Compared with old patching code, this new infrastructure has much less core
code, even though the final code has a couple of extra lines but that is
mostly due to for list based infrastructure, we need to do more error
checks, so the list and associated aux data structure could be freed when
errors happens.

Patching Restrictions
===
  - For core layer, the linearization assume no new jumps inside patch buf.
    Currently, the only user of this layer is jit blinding.
  - For verifier layer, there could be new jumps inside patch buf, but
    they should have branch target resolved themselves, meaning new jumps
    doesn't jump to insns out of the patch buf. This is the case for all
    existing verifier layer users.
  - bpf_insn_aux_data for all patched insns including the one at patch
    point are invalidated, only 32-bit zext info will be recalcuated.
    If the aux data of insn at patch point needs to be retained, it is
    purely insn insertion, so need to use the pre-patch API.

I plan to send out a PATCH set once I finished insn deletion line info adj
support, please have a looks at this RFC, and appreciate feedbacks.

Jiong Wang (8):
  bpf: introducing list based insn patching infra to core layer
  bpf: extend list based insn patching infra to verification layer
  bpf: migrate jit blinding to list patching infra
  bpf: migrate convert_ctx_accesses to list patching infra
  bpf: migrate fixup_bpf_calls to list patching infra
  bpf: migrate zero extension opt to list patching infra
  bpf: migrate insn remove to list patching infra
  bpf: delete all those code around old insn patching infrastructure

 include/linux/bpf_verifier.h |   1 -
 include/linux/filter.h       |  27 +-
 kernel/bpf/core.c            | 431 +++++++++++++++++-----------
 kernel/bpf/verifier.c        | 649 +++++++++++++++++++------------------------
 4 files changed, 580 insertions(+), 528 deletions(-)

-- 
2.7.4




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux