Hi,

The binary produced by clang's `-O1` runs much slower (3x) than the one produced by GCC's `-O1`. This is reproducible with:

https://github.com/m-chaturvedi/test_valgrind_slowdown

We are seeing this difference between GCC and clang in other places as well. The `-O0` and `-O2` times are comparable, however.

Are there any compile-time flags one could add to make the `-O1` times comparable?

I asked this question on the llvm-dev list, but they didn't seem very interested, so I was wondering if someone here could help:

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123708.html

Thank you,
Mmanu