----- On Sep 24, 2021, at 3:55 PM, Segher Boessenkool segher@xxxxxxxxxxxxxxxxxxx wrote:

> Hi!
>
> On Fri, Sep 24, 2021 at 02:38:58PM -0400, Mathieu Desnoyers wrote:
>> Following the LPC2021 BoF about control dependency, I re-read the kernel
>> documentation about control dependency, and ended up thinking that what
>> we have now is utterly fragile.
>>
>> Considering that the goal here is to prevent the compiler from being able to
>> optimize a conditional branch into something which lacks the control
>> dependency, while letting the compiler choose the best conditional
>> branch in each case, how about the following approach ?
>>
>> #define ctrl_dep_eval(x)        ({ BUILD_BUG_ON(__builtin_constant_p((_Bool)
>> x)); x; })
>> #define ctrl_dep_emit_loop(x)   ({ __label__ l_dummy; l_dummy: asm volatile goto
>> ("" : : : "cc", "memory" : l_dummy); (x); })
>> #define ctrl_dep_if(x)          if ((ctrl_dep_eval(x) && ctrl_dep_emit_loop(1))
>> || ctrl_dep_emit_loop(0))
>
> [The "cc" clobber only pessimises things: the asm doesn't actually
> clobber the default condition code register (which is what "cc" means),
> and you can have conditional branches using other condition code
> registers, or on other registers even (general purpose registers is
> common.]

I'm currently considering removing both the "memory" and "cc" clobbers
from the asm goto.

>
>> The idea is to forbid the compiler from considering the two branches as
>> identical by adding a dummy loop in each branch with an empty asm goto.
>> Considering that the compiler should not assume anything about the
>> contents of the asm goto (it's been designed so the generated assembly
>> can be modified at runtime), then the compiler can hardly know whether
>> each branch will trigger an infinite loop or not, which should prevent
>> unwanted optimisations.
>
> The compiler looks if the code is identical, nothing more, nothing less.
> There are no extra guarantees.  In principle the compiler could see both
> copies are empty asms looping to self, and so consider them equal.

I would expect the compiler not to attempt to combine asm goto
statements based on their similarity: ever since the kernel community's
original requirements were communicated to the gcc developers, it has
been clear that one major use case of asm goto involves self-modifying
code (patching between nops and jumps).

If such merging turns out to be a real possibility, then we may need to
work around it for other uses of asm goto as well.

If there is indeed a scenario where the compiler can combine similar
asm goto statements, then I suspect we may want to emit unique dummy
code in the assembly, placed in a discarded section, e.g.:

#define ctrl_dep_emit_loop(x)   ({ __label__ l_dummy; l_dummy: asm goto (       \
                ".pushsection .discard.ctrl_dep\n\t"                            \
                ".long " __stringify(__COUNTER__) "\n\t"                        \
                ".popsection\n\t"                                               \
                "" : : : : l_dummy); (x); })

But then a similar trick would be needed for jump labels as well.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com