Re: Compilation of lengthy C++ Files

On 18/10/2023 18:04, Kai Song via Gcc-help wrote:
Dear GCC Developers,

I am unsuccessfully trying to use g++ 12.0.4 to compile lengthy C++ files. These
files are automatically generated by my own code-generator tools, which
depend on a parameter p.
Typical applications are:
- Taylor series of order p inserted into consistency conditions of
numerical schemes, to determine optimal method parameters (think of, e.g.,
Runge-Kutta methods)
- recursive automatic code transformation (think of adjoints of adjoints of
adjoints...) of recursion level p
- Hilbert curves or other space-filling curves to generate code that
simulates cache utilization in a Monte-Carlo context

I have verified that for small p the generated code compiles and executes with
the expected result. However, there is always a threshold for p beyond which
the generated .cpp file is so long that the compiler simply terminates after
~10 min, with no output on the console, returning exit code 1.
My .cpp files range from 600k LOC up to 1 billion LOC. Often, a file consists
of one single C++ template class member function definition that relies on
a few thousand lines of template classes.

I would like to know:
1) Am I doing something wrong, given that GCC should be able to compile
lengthy code?
2) Is it known that GCC is unable to compile lengthy code?
3) Is it acknowledged that a compiler's ability to compile large files is
relevant?
4) Are the possible root causes of this inability known, and are they
deliberate?


I am curious to know why you are generating code like this. I can see how some code generators for mathematical code could easily produce large amounts of code, but it is rarely ideal for real-world use. Flattened code of that kind can reduce overheads and improve optimisation opportunities (inlining, constant folding, function cloning, etc.) for small cases, but for larger cases it becomes impractical to compile, and the cost of instruction-cache misses soon outweighs the loop or recursion overhead that a general solution would have (see the sketch below).
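As a minimal sketch of the trade-off (my own illustration, not your generated code - here a fully unrolled Taylor polynomial for exp(x) against its loop-based equivalent):

    // A generator might emit a fully unrolled degree-4 Taylor
    // polynomial for exp(x), with the coefficients 1/n! baked in
    // as constants (written in Horner form):
    double exp_taylor_flat(double x)
    {
        return 1.0 + x*(1.0 + x*(0.5 + x*(1.0/6.0 + x*(1.0/24.0))));
    }

    // The general loop-based version handles any order p in a few
    // lines, at the cost of a loop counter and a running divide:
    double exp_taylor_loop(double x, int p)
    {
        double term = 1.0, sum = 1.0;
        for (int n = 1; n <= p; ++n) {
            term *= x / n;     // term now holds x^n / n!
            sum += term;
        }
        return sum;
    }

For order 4 the flat version is a clear win; for order 10^6 it is many megabytes of source that the compiler must parse, optimise and register-allocate as one block.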

Any compiler is going to be targeted and tuned towards "normal" or "typical" code. That means primarily hand-written code, or modest amounts of generated code. I know that some systems generate very large functions or large files, but those are primarily C code, and the code is often very simple and "flat". (Applications here include compilers that use C as an intermediate target language, and simulators of various kinds.) For such code it typically makes sense to disable certain optimisation passes, since a number of passes scale badly (quadratically, or perhaps worse) with function size.
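If you want to experiment, GCC lets you lower the optimisation level for individual functions, so a huge generated function can be compiled cheaply while the rest of the program keeps the full -O2 treatment. A minimal sketch (the function name is hypothetical):

    // Build the translation unit with e.g. g++ -O2, but tell GCC to
    // apply only -O1 to this one enormous generated function.  The
    // same effect is available for a block of functions with
    // #pragma GCC optimize ("O1").
    __attribute__((optimize("O1")))
    double huge_generated_function(double x)   // hypothetical name
    {
        // ... hundreds of thousands of generated lines ...
        return x;
    }

Compiling with -ftime-report (and -fmem-report) will also show which passes dominate the compile time and memory, which tells you what is worth turning off.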

However, if you are generating huge templates in C++, you are going a big step beyond that - templates are, in a sense, code generators in their own right, running at compile time as an interpreted meta-language. I don't expect that there has been a deliberate decision to limit GCC's handling of large files, but I can't imagine that huge templates are a major focus of compiler development. And I would expect enormous memory use and compile times even when it does work.
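To see why templates compound the problem, remember that every distinct instantiation forces the compiler to generate and optimise what is effectively a new function. A tiny sketch of the kind of compile-time recursion involved (my own example, not your code):

    #include <cstddef>

    // Each recursion level instantiates a new function, so recursion
    // depth p (adjoints of adjoints of adjoints...) multiplies the
    // amount of code GCC must generate before optimisation even starts.
    template <std::size_t P>
    double nested(double x)
    {
        return nested<P - 1>(x) * x;
    }

    template <>
    double nested<0>(double x)
    {
        return x;
    }

    int main()
    {
        return static_cast<int>(nested<500>(1.0));   // 501 instantiations
    }

With your generator producing the flattened code and templates multiplying it again at compile time, the total work grows very quickly, which would fit the symptoms you describe.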




