On Wed, 18 Oct 2023, 17:05 Kai Song via Gcc-help, <gcc-help@xxxxxxxxxxx> wrote:

> Dear GCC Developers,
>
> I am unsuccessfully using g++ 12.0.4 to compile lengthy c++ codes. Those
> codes are automatically generated from my own code-generator tools that
> depend on parameters p.
> Typical applications are:
> - Taylor series of order p inserted into consistency conditions of
> numerical schemes, to determine optimal method parameters (think of, e.g.,
> Runge-Kutta methods)
> - recursive automatic code transformation (think of adjoints of adjoints of
> adjoints...) of recursion level p
> - Hilbert curves or other space-filling curves to generate code that
> simulates cache utilization in a Monte-Carlo context
>
> I verify that for small p the codes compile and execute to the expected
> result. However, there is always a threshold for p so that the generated
> cpp file is so long that the compiler will just terminate after ~10min
> without monitor output but return the value +1.
> My cpp files range from 600k LOC up to 1Bio LOC. Often, the file consists
> of one single c++ template class member function definition that relies on
> a few thousand lines of template-classes.
>
> I would like to know:
> 1) Am I doing something wrong in that GCC should be able to compile lengthy
> codes?
>

Do you have enough RAM?

> 2) Is it known that GCC is unable to compile lengthy codes?
>

Yes, there are loads of bug reports about this kind of thing, some fixed,
some not.

> 3) Is it acknowledged that a compiler's ability to compile large files is
> relevant?
>

Yes, within reason. Some generated code is just silly and could be written
differently.

> 4) Are possible roots known for this inability and are these deliberate?
>

It depends on the code. Sometimes there are quadratic (or worse) algorithms
involved. Sometimes GCC's intermediate representation is just too memory
hungry.

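On the RAM question: one way to find out whether memory is the limit is to
run the compiler as a child process and look at how it ended and how much
memory it peaked at. The following is only a rough sketch for a POSIX
system; the names ChildReport and run_and_measure are made up for
illustration, and the unit of ru_maxrss is platform dependent (kilobytes on
Linux).

```cpp
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <string>
#include <vector>

// Hypothetical report type, not part of any library.
struct ChildReport {
    bool exited_normally;  // false if killed by a signal or if fork/wait failed
    int exit_code;         // meaningful only when exited_normally is true
    int signal_number;     // signal that killed the child, 0 otherwise
    long peak_rss;         // ru_maxrss of the largest waited-for child
};

// Runs args[0] with the given arguments and waits for it to finish.
ChildReport run_and_measure(const std::vector<std::string>& args) {
    ChildReport report{false, -1, 0, 0};
    if (args.empty()) return report;

    // Build a null-terminated argv array for execvp.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) return report;  // fork failed
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // reached only if execvp failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return report;
    if (WIFEXITED(status)) {
        report.exited_normally = true;
        report.exit_code = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        report.signal_number = WTERMSIG(status);
    }

    // Statistics for children that have terminated and been waited for.
    struct rusage usage{};
    if (getrusage(RUSAGE_CHILDREN, &usage) == 0) report.peak_rss = usage.ru_maxrss;
    return report;
}
```

Calling run_and_measure({"g++", "-c", "generated.cpp"}) (the file name is a
placeholder) for increasing p would show how the peak grows towards the
threshold where compilation starts to fail.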
> To give just one glimpse at the relevance: I analyzed methods for solving
> F(x)=0 by using k stages -DF(x)*vk= F(x+sum_i bki*vi).
> Compiling a Taylor series code with p=3, I was able to verify that three
> iterations of the Chord method converge of fourth order, thus surpassing
> Newton's method (same cost, but only quadratic order).
> Checking whether any four-stage method exists that converges of fifth order
> necessitates the compilation of a single cpp file function that is 1.2Mio
> lines of code long.

That's going to use insane amounts of memory just to represent the function,
before even trying to analyse it or optimise it.

> Since 2018 I am occasionally trying to find a way to
> compile it.
> I also used this code to automatically generate Butcher-type tableaus for
> generalizations to initial-value problem solvers. One could really only
> dream of what scientific leaps become possible only if lengthy codes were
> compilable.
>
> Right now, whenever a situation of this kind occurs, I have to give up, cut
> my (possibly months of work) losses, and move on to working on something
> else -- because it seems there is nothing I can do about it.
>

Have you tried using clang instead?

> Even splitting a code of 1Bio LOC into chunks of 100k LOC and then
> automating to compile and link these in turn seems hopeless; and probably
> the linker has limitations as well..
>

Very different ones though.

> But since this problem is too frequent in my life, I would really like to
> find out how to fix it once and for all.
> This is a problem that I have been suffering from severely for years
> because it basically cripples how I can put my mathematical expertise to
> use.
>
> Thank you for getting in touch.
>
> Kind regards,
> Kai
>
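
On writing the generated code differently: the generator could emit many
small functions instead of one huge one, so that the compiler never has to
hold a million-line function in memory at once. The sketch below is only an
illustration under strong assumptions: the generated code is taken to be a
flat sequence of statements that all work on one shared array `double* v`,
and the names GeneratedSources, split_into_chunks, chunk_N and run_all are
invented here. Template-heavy code like the one described may need more
work to get into this shape.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the generator output.
struct GeneratedSources {
    std::vector<std::string> chunks;  // source text of one small function each
    std::string driver;               // declares every chunk and calls them in order
};

// Splits a flat list of statements into functions of at most
// max_per_chunk statements each. Every statement is assumed to read and
// write only the shared array `double* v`.
GeneratedSources split_into_chunks(const std::vector<std::string>& statements,
                                   std::size_t max_per_chunk) {
    assert(max_per_chunk > 0);
    GeneratedSources out;
    std::string calls;

    for (std::size_t begin = 0; begin < statements.size(); begin += max_per_chunk) {
        const std::size_t end = std::min(begin + max_per_chunk, statements.size());
        const std::string name = "chunk_" + std::to_string(out.chunks.size());

        // Emit one small function holding this slice of statements.
        std::string src = "void " + name + "(double* v) {\n";
        for (std::size_t i = begin; i < end; ++i) {
            src += "    " + statements[i] + "\n";
        }
        src += "}\n";
        out.chunks.push_back(src);

        // The driver needs a declaration and a call for each chunk.
        out.driver += "void " + name + "(double* v);\n";
        calls += "    " + name + "(v);\n";
    }

    out.driver += "void run_all(double* v) {\n" + calls + "}\n";
    return out;
}
```

Each entry of `chunks` can be written to its own .cpp file and compiled
separately, with the driver in one more file; the linker then only has to
resolve one symbol per chunk.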