On Thu, 2024-11-07 at 09:50 -0800, Eduard Zingerman wrote:
> Consider dead code elimination problem for program like below:
>
>     main:
>       1: r1 = 42
>       2: call <subprogram>;
>       3: exit
>
>     subprogram:
>       4: r0 = 1
>       5: if r1 != 42 goto +1
>       6: r0 = 2
>       7: exit;
>
> Here verifier would visit every instruction and thus
> bpf_insn_aux_data->seen flag would be set for both true (7)
> and fallthrough (6) branches of conditional (5).
> Hence opt_hard_wire_dead_code_branches() will not replace
> conditional (5) with unconditional jump.

[...]

Had an off-list discussion with Alexei yesterday.
Here are some answers to the questions raised:
- Patches #1 and #2, with the opt_hard_wire_dead_code_branches() changes,
  are not necessary for dynptr_slice kfunc inlining / branch removal.
  I will drop these patches and adjust the test cases.
- Did some measurements for the dynptr_slice call using the simple
  benchmark from patch #11:
  - baseline:
    76.167 ± 0.030 million calls per second;
  - with call inlining, but without branch pruning (only patch #3):
    101.198 ± 0.101 million calls per second;
  - with call inlining and with branch pruning (full patch-set):
    116.935 ± 0.142 million calls per second.
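
For readers who want the quoted example at the top of this mail spelled out, below is a toy model of the "seen"-based decision. It is only a sketch and not the kernel code: the names toy_aux, toy_rewrite, toy_hard_wire(), toy_quoted_example() and toy_variation() are hypothetical, the seen flags are filled in by hand rather than by a verifier, and the second program layout is an assumed variation that does not appear in the patch-set. The only rule modelled is the one described in the quote: a conditional jump can be hard-wired only when one of its two successors was never seen.

```c
#include <stdbool.h>

/* Toy stand-in for per-instruction verifier state; only the 'seen'
 * flag mentioned in the quoted mail is modelled. */
struct toy_aux {
	bool seen;
};

enum toy_rewrite {
	TOY_KEEP_COND,   /* both successors seen: conditional stays */
	TOY_ALWAYS_JUMP, /* fall-through never seen: jump always taken */
	TOY_NEVER_JUMP,  /* jump target never seen: jump never taken */
};

/* Decide what to do with the conditional jump at index 'i' with
 * relative offset 'off' (the target is i + 1 + off). */
enum toy_rewrite toy_hard_wire(const struct toy_aux *aux, int i, int off)
{
	if (!aux[i + 1].seen)
		return TOY_ALWAYS_JUMP;
	if (!aux[i + 1 + off].seen)
		return TOY_NEVER_JUMP;
	return TOY_KEEP_COND;
}

/* Program from the quote, indexed by its instruction numbers
 * (slot 0 is unused). r1 is always 42, so the jump at (5) is never
 * taken, yet (7) is still reached by falling through from (6). */
enum toy_rewrite toy_quoted_example(void)
{
	struct toy_aux aux[8] = { { false } };

	for (int i = 1; i <= 7; i++)
		aux[i].seen = true;
	return toy_hard_wire(aux, 5, 1);
}

/* Assumed variation: (5) is "if r1 != 42 goto +2", (7) is an exit and
 * (8) is reachable only through the jump. The flags assume a walk
 * that never takes the jump, so (8) stays unseen. */
enum toy_rewrite toy_variation(void)
{
	struct toy_aux aux[10] = { { false } };

	for (int i = 1; i <= 7; i++)
		aux[i].seen = true;
	return toy_hard_wire(aux, 5, 2);
}
```

In this model toy_quoted_example() yields TOY_KEEP_COND, which matches the behaviour described in the quote, while toy_variation() yields TOY_NEVER_JUMP because the jump target has no other path leading to it.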