On Fri, May 01, 2020 at 12:40:53PM -0700, Alexei Starovoitov wrote: > On Fri, May 01, 2020 at 02:22:04PM -0500, Josh Poimboeuf wrote: > > On Fri, May 01, 2020 at 12:09:30PM -0700, Alexei Starovoitov wrote: > > > On Thu, Apr 30, 2020 at 02:07:43PM -0500, Josh Poimboeuf wrote: > > > > Objtool decodes instructions and follows all potential code branches > > > > within a function. But it's not an emulator, so it doesn't track > > > > register values. For that reason, it usually can't follow > > > > intra-function indirect branches, unless they're using a jump table > > > > which follows a certain format (e.g., GCC switch statement jump tables). > > > > > > > > In most cases, the generated code for the BPF jump table looks a lot > > > > like a GCC jump table, so objtool can follow it. However, with > > > > RETPOLINE=n, GCC keeps the jump table address in a register, and then > > > > does 160+ indirect jumps with it. When objtool encounters the indirect > > > > jumps, it can't tell which jump table is being used (or even whether > > > > they might be sibling calls instead). > > > > > > > > This was fixed before by disabling an optimization in ___bpf_prog_run(), > > > > using the "optimize" function attribute. However, that attribute is bad > > > > news. It doesn't append options to the command-line arguments. Instead > > > > it starts from a blank slate. And according to recent GCC documentation > > > > it's not recommended for production use. So revert the previous fix: > > > > > > > > 3193c0836f20 ("bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()") > > > > > > > > With that reverted, solve the original problem in a different way by > > > > getting rid of the "goto select_insn" indirection, and instead just goto > > > > the jump table directly. This simplifies the code a bit and helps GCC > > > > generate saner code for the jump table branches, at least in the > > > > RETPOLINE=n case. > > > > > > > > But, in the RETPOLINE=y case, this simpler code actually causes GCC to > > > > generate far worse code, ballooning the function text size by +40%. So > > > > leave that code the way it was. In fact Alexei prefers to leave *all* > > > > the code the way it was, except where needed by objtool. So even > > > > non-x86 RETPOLINE=n code will continue to have "goto select_insn". > > > > > > > > This stuff is crazy voodoo, and far from ideal. But it works for now. > > > > Eventually, there's a plan to create a compiler plugin for annotating > > > > jump tables. That will make this a lot less fragile. > > > > > > I don't like this commit log. > > > Here you're saying that the code recognized by objtool is sane and good > > > whereas well optimized gcc code is somehow voodoo and bad. > > > That is just wrong. > > > > I have no idea what you're talking about. > > > > Are you saying that ballooning the function text size by 40% is well > > optimized GCC code? It seems like a bug to me. That's the only place I > > said anything bad about GCC code. > > It could be a bug, but did you benchmark the speed of interpreter ? > Is it faster or slower with 40% more code ? > Did you benchmark it on other archs ? I thought we were in agreement that 40% text growth is bad. Isn't that why you wanted to keep 'goto select_insn' for the retpoline case? If there's some other reason, let me know and I'll put it in the patch description instead. > > When I said "this stuff is crazy voodoo" I was referring to the patch > > itself. I agree it's horrible, it's only the best approach we're able > > to come up with at the moment. > > please reword it then. Ok, so: This *patch* is crazy voodoo ? -- Josh