On Thu, Mar 4, 2021 at 5:55 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> > (a) the traditional include guard optimization HAS NO HIDDEN SEMANTIC
> > MEANING. It's a pure optimization that doesn't actually change
> > anything else. If you don't do the optimization, absolutely nothing
> > changes.
>
> And if the parser is well written the optimisation is probably
> irrelevant compared to the compile time.

That's actually surprisingly not even remotely true.

People always think that the optimization phases of a compiler are the expensive ones. And yes, there are certain optimizations that can be *really* expensive, and people just don't even do them because they are _so_ expensive and are exponential in time.

And yes, there can be some patterns that expose bad scaling in some compiler algorithms, and historically that has often been seen when people use generators to automatically generate huge functions (or huge initializers), and then the compiler has some O(n**3) thing in it that is entirely unnoticeable for normal code written by a human, but means that a million-entry initializer takes five hours to compile.

But in reality, on very real normal code, the *front end* of the compiler is often the most costly thing by far. Very much just the lexer and the simple parser. Things that are "simple" and people think are fast.

But they are often the slowest part in C-style languages, because you have to lex and parse _everything_. You include maybe a dozen header files, maybe more. Those in turn often include another dozen support header files. You easily end up tokenizing and parsing hundreds of header files for even the simplest programs - and 99.9% of that isn't ever *used*, and never actually generates any code that needs to be optimized.

Think of all the complex macros and functions and type definitions you get when you include something basic like <stdio.h>. Trust me, it's a _lot_ of small details that get tokenized, parsed and memoized. And then all you do is a 'printf("Hello world");' and the amount of actual code generation and optimization by the back-end is basically zero.

And C++ is about an order of magnitude worse, because you really take this whole approach and turn it up to 11 with templates, namespaces, STL, and a lot of all that infrastructure that is there for all the possible cases, but that most files that include it only use a small tiny portion of.

So in C-style (and particularly C++) languages, reading header files, tokenizing them and parsing them can _easily_ be the bulk of your compile time.

Absolutely every single serious C/C++ compiler integrates the preprocessor _into_ the compiler, because the traditional model of having a separate "cpp" phase actually made the tokenization problem happen _twice_: once by the pre-processor, and then a second time by the compiler. Integrating the two phases, so that you can use just one tokenizer, and one single set of identifiers, actually speeds up compile speed _enormously_.

Yes, yes, tokenization and parsing really is the simple part of a compiler. Good register allocation is _hard_. SSA makes a lot of the basic optimizations actually fairly straightforward and simple, but more complex transformations (loop invariant stuff, vectorization, various things like that) are really much much much more complex than the simple front-end.

But that simple front end is the thing that touches absolutely _everything_, whether it actually generates code or not.
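(Side note, purely as a rough illustration - the exact numbers obviously depend on your libc and compiler version, and "hello.c" is just a made-up example file name:

    /* hello.c - trivial program that only uses one declaration from <stdio.h> */
    #include <stdio.h>

    int main(void)
    {
            printf("Hello world\n");
            return 0;
    }

Run just the preprocessor on that - "gcc -E hello.c | wc -l" - and those few lines typically expand to many hundreds of lines of declarations, types and macros pulled in from <stdio.h> and the headers it includes. Essentially none of it ever reaches the optimizer or the code generator.)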
And yes, this is mainly a C-style language problem. If you have a hardcoded modules approach and not the "include files that describe all the different interfaces" model, you don't end up in that situation where you spend a lot of time parsing the possible interfaces over and over again.

And C++ really is hugely worse, and takes this issue to another level. You can have simple single C++ files that end up basically having to parse hundreds of thousands of lines of fairly complex code, because they use several complex libraries, and that's how the library header files are set up, with multiple levels (ie the GUI header files depend on, and include, the lower-level object header files, which in turn depend on "simple" core libraries like STL and Boost).

This is why pretty much every compiler out there does that include file header guard optimization. Because avoiding having to read and parse a file multiple times really does show up as a big big win.

Even when it's all hidden behind the #ifndef/#define/#endif guarding logic, the cost of reading the file and lexing and parsing it enough to _see_ those guards is not cheap. Doing it once is required and important, but then you memoize the fact that "oh, all the stuff I needed to parse was behind this macro test, so next time I see this file, if that macro is set, I can just ignore it".

So it is actually a very important optimization. It's just that the optimization is better done with that explicit guard memoization than it is with '#pragma once'.
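(For reference, the pattern being memoized is just the classic guard - "foo.h" and FOO_H here are obviously made-up example names:

    /* foo.h */
    #ifndef FOO_H
    #define FOO_H

    int foo(int x);         /* ... whatever the header declares ... */

    #endif /* FOO_H */

After the compiler has read foo.h once, it remembers that the entire file body sat inside that single #ifndef FOO_H test. The next time it sees the same file included while FOO_H is defined, it doesn't even have to open and tokenize it again. '#pragma once' buys you the same skip, just keyed on the compiler's notion of file identity instead of a macro that's right there in the source.)

Linus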