Edward Cree writes:

> On 17/06/2019 21:40, Jiong Wang wrote:
>> Now if we don't split patch when patch an insn inside patch, instead, if we
>> replace the patched insn using what you suggested, then the logic looks to
>> me becomes even more complex, something like
>>
>>   for (idx = 0; idx < insn_cnt; idx++) {
>>     if (insns[idx] is not BPF_LIST_INSN) {
>>       do_insn(...)
>>     }
>>     else if (insns[idx] is BPF_LIST_INSN) {
>>       list = pool_base + insn.imm;
>>       while (list) {
>>         insn = list_head->insn;
>>         if (insn is BF_LIST_INSN) {
>>           sub_list = ...
>>           while ()
>>             do_insn()
>>           continue;
>>         }
>>         do_insn(...)
>>         list = pool_base + list->next;
>>       }
>>     }
>>   }
>
> Why can't do_insn() just go like:
>     if (insn is BPF_LIST_INSN)
>         for (idx = 0; idx < LIST_COUNT(insn); idx++)
>             do_insn(pool_base + LIST_START(insn) + idx);
>     else
>         rest of processing
> ?
>
> Alternatively, iterate with something more sophisticated than 'idx++'
> (standard recursion-to-loop transformation).
> You shouldn't ever need a for() tower statically in the code...

I don't think this changes things much.  The point is we would still have two
data structures for insns, array + list, so I feel you anyway need some tweaks
on the existing traversal code, while using a singly linked list incurs very
little change, for example:

  for (i = 0; i < insn_cnt; i++, insn++) {
=>
  for (elem = list; elem; elem = elem->next) {
    insn = elem->insn;

>> So, I am thinking what Alexei and Andrii suggested make sense, just use
>> single data structure (singly linked list) to represent everything, so the
>> insn traversal etc could be simple
>
> But then you have to also store orig_insn_idx with each insn, so you can
> calculate the new jump offsets when you linearise.  Having an array of
> patched_orig_insns gives you that for free.

For the pool based list, you don't need to store orig_insn_idx: the orig insns
are guaranteed to sit at the bottom of the pool, so checking index <
orig_prog_len is enough to know that a node is an orig insn.

And for both the pool and non-pool based lists, the order of the orig nodes in
the list is the same as in the array, so it is quite easy to calculate the
orig index as a by-product of the insn copy traversal.  For the non-pool based
list, each node needs at least one bit to indicate that it is an orig node.

I also found that patching a patched buffer which contains a jmp insn is an
issue (need to double check whether such a case exists), because then we would
need to record the jump destination index of the jmp insn when it was inserted.

And some updates on my side: I did some benchmarking on both the pool and
non-pool based lists.

  Patching time (ns) benchmarked using JIT blinding
  ===
                         existing     non-pool        pool
  "scale test 1"         no stop      ~0x1c600000     ~0x8800000
  Bench (~3.4K insns)    ~0xc50000    ~0xf1000        ~0x6b000

(Non-pool means kmalloc'ing a list node for each patch snippet; pool means
vmalloc'ing one big chunk of memory and allocating nodes from it, a node being
located via pool_base + index.)

For "scale test 1", which contains ~1M JIT blindable insns, the list based
patching infra removes most of the patching time, and the pool based
allocation takes only 1/3 the time of the non-pool one.  For a normal program
of reasonable size (~3.4K insns), the pool based allocation takes only 1/30
the time of the existing infra, and 1/2.3 of the non-pool one.

On the other hand, the non-pool based implementation is cleaner and less error
prone than the pool based one.
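To make the pool based scheme above concrete, below is a minimal userspace
sketch (malloc stands in for vmalloc; bpf_list_insn, pool_base, orig_prog_len
and is_orig_insn are placeholder names for illustration only, not the actual
patch) of chaining nodes by pool index, splicing in a patch node, and
recovering the orig index as a by-product of the single list traversal:

  /*
   * Minimal sketch of the pool based list described above.  Placeholder
   * names and userspace allocation; not the real verifier code.
   */
  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  struct bpf_insn { int32_t imm; };       /* simplified stand-in */

  struct bpf_list_insn {
          struct bpf_insn insn;
          uint32_t next;                  /* pool index of next node, 0 = end */
  };

  static struct bpf_list_insn *pool_base;
  static uint32_t orig_prog_len;          /* orig insns occupy slots 1..orig_prog_len */

  /* Orig insns sit at the bottom of the pool, so the node's own pool
   * index is enough to tell whether it is an orig insn. */
  static int is_orig_insn(const struct bpf_list_insn *node)
  {
          uint32_t idx = (uint32_t)(node - pool_base);

          return idx >= 1 && idx <= orig_prog_len;
  }

  int main(void)
  {
          uint32_t idx, used, orig_idx = 0;
          struct bpf_list_insn *node;

          pool_base = calloc(16, sizeof(*pool_base));
          if (!pool_base)
                  return 1;

          /* Build a 3-insn "program"; slot 0 is reserved as the list end. */
          orig_prog_len = 3;
          for (idx = 1; idx <= orig_prog_len; idx++) {
                  pool_base[idx].insn.imm = idx * 10;
                  pool_base[idx].next = idx < orig_prog_len ? idx + 1 : 0;
          }
          used = orig_prog_len + 1;

          /* Patch: splice one new pool node in after insn 2, no array shift. */
          node = &pool_base[used];
          node->insn.imm = 999;
          node->next = pool_base[2].next;
          pool_base[2].next = used;

          /* One list traversal replaces the old array walk; orig_idx is the
           * running count of orig nodes seen so far. */
          for (idx = 1; idx; idx = pool_base[idx].next) {
                  node = &pool_base[idx];
                  if (is_orig_insn(node))
                          orig_idx++;
                  printf("imm=%-4d orig=%d orig_idx=%u\n",
                         node->insn.imm, is_orig_insn(node), orig_idx);
          }

          free(pool_base);
          return 0;
  }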
And for both pool and non-pool, I got the following kernel warning when
running "scale test 11" (perhaps needs 3M * 16 ~= 48M mem):

  [ 92.319792] WARNING: CPU: 6 PID: 2322 at mm/page_alloc.c:4639 __alloc_pages_nodemask+0x29e/0x330

I am finishing aux adjust and code delete support.

Regards,
Jiong