On Mon, Jul 08, 2019 at 03:49:33PM -0700, Alexei Starovoitov wrote:
> > > Sorry for delay. I'm mostly offgrid until next week.
> > > As far as -fno-gcse.. I don't mind as long as it doesn't hurt performance.
> > > Which I suspect it will :(
> > > All these indirect gotos are there for performance.
> > > Single indirect goto and a bunch of jmp select_insn
> > > are way slower, since there is only one instruction
> > > for cpu branch predictor to work with.
> > > When every insn is followed by "jmp *jumptable"
> > > there is more room for cpu to speculate.
> > > It's been long time, but when I wrote it the difference
> > > between all indirect goto vs single indirect goto was almost 2x.
> >
> > Just to clarify, -fno-gcse doesn't get rid of any of the indirect jumps.
> > It still has 166 indirect jumps.  It just gets rid of the second
> > optimization, where the jumptable address is placed in a register.
>
> what about other functions in core.c ?
> May be it's easier to teach objtool to recognize that pattern?

The GCC man page actually recommends using -fno-gcse for computed goto
code, for better performance.  So if that's actually true, then it would
be win-win because objtool wouldn't need a change for it.

Otherwise I can teach objtool to recognize the new pattern.

> > If you have a benchmark which is relatively easy to use, I could try to
> > run some tests.
>
> modprobe test_bpf
> selftests/bpf/test_progs
> both print runtime.
> Some of test_progs have high run-to-run variations though.

Thanks, I'll give it a shot.

--
Josh
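
For anyone following along, the "every insn is followed by jmp *jumptable" style can be sketched roughly as below. This is a toy interpreter, not the code in core.c: the names toy_insn, toy_run, DISPATCH and the OP_* opcodes are made up for illustration, and it relies on the GCC/Clang "labels as values" extension.

```c
/* Hypothetical toy instruction set, not the real BPF one. */
enum { OP_ADD, OP_MUL, OP_EXIT };

struct toy_insn {
    int code;  /* one of the OP_* values */
    long imm;  /* immediate operand */
};

/*
 * Threaded dispatch: every handler ends with its own indirect jump
 * through the jump table, so the CPU has a separate indirect branch
 * to predict after each opcode instead of one shared branch.
 */
long toy_run(const struct toy_insn *insn)
{
    static const void *const jumptable[] = {
        [OP_ADD]  = &&do_add,
        [OP_MUL]  = &&do_mul,
        [OP_EXIT] = &&do_exit,
    };
    long acc = 0;

/* Jump straight to the handler for the current instruction. */
#define DISPATCH() goto *jumptable[insn->code]

    DISPATCH();

do_add:
    acc += insn->imm;
    insn++;
    DISPATCH();
do_mul:
    acc *= insn->imm;
    insn++;
    DISPATCH();
do_exit:
    return acc;

#undef DISPATCH
}
```

The "single indirect goto" variant would instead have each handler jump back to one shared label that performs the only `goto *jumptable[...]`. Building a file like this with and without -fno-gcse and comparing the generated code is one way to see what the option changes, though the toy is far too small to say anything about performance.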