* Josh Chia (謝任中) via Gcc-help:

> I have a code snippet that I'm wondering why GCC didn't optimize the way I
> think it should:
> https://godbolt.org/z/1qKvax
>
> bar2() is a variant of bar1() that has been manually tweaked to avoid
> branches. I haven't done any benchmarks, but I would expect the branchless
> bar2() to perform better than bar1(). However, GCC does not automatically
> optimize bar1() to be like bar2(); the generated code for bar1() and bar2()
> is different, and the generated code for bar1() contains a branch.

The optimization is probably valid for C99, but not for C11, where the
memory model prevents the compiler from introducing spurious writes.
Another thread may modify the variable concurrently. If it does so only
when foo() returns NULL, the original bar1() does not contain a data race,
but the branchless version would, because it writes the variable
unconditionally.

Thanks,
Florian
--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
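
To make the spurious-write argument in the reply concrete, here is a minimal
sketch in C. It is not the code behind the godbolt link, which is not
reproduced in this message. The body of foo(), the parameter names, and the
helper check_single_threaded() are all hypothetical, chosen only to show a
conditional store next to an unconditional one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for foo(): returns NULL when there is nothing to store. */
static const int *foo(const int *candidate)
{
    return (candidate != NULL && *candidate >= 0) ? candidate : NULL;
}

/* Branching form: *dst is written only when foo() returns non-NULL. */
static void bar1(int *dst, const int *candidate)
{
    const int *p = foo(candidate);
    if (p != NULL)
        *dst = *p;
}

/* Branchless-style form: *dst is always written. When foo() returns NULL,
 * the old value is loaded and stored back; that store is the spurious write. */
static void bar2(int *dst, const int *candidate)
{
    const int *p = foo(candidate);
    const int *src = (p != NULL) ? p : dst;  /* select a source, no early exit */
    *dst = *src;                             /* unconditional store */
}

/* Hypothetical helper: with a single thread, both forms leave the same
 * value in *dst, which is why the rewrite looks harmless at first. */
static void check_single_threaded(int initial, const int *candidate)
{
    int a = initial, b = initial;
    bar1(&a, candidate);
    bar2(&b, candidate);
    assert(a == b);
}
```

With a single thread the two forms are indistinguishable. The difference
appears only if another thread writes *dst exactly in the cases where foo()
returns NULL: bar1() then never touches *dst, while the load and store in
bar2() conflict with that write and may overwrite the other thread's update.
A compiler that turned bar1() into bar2() on its own would therefore add a
data race to a program that had none.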