Re: Compilation of lengthy C++ Files


 



On Wed, 18 Oct 2023, 17:05 Kai Song via Gcc-help, <gcc-help@xxxxxxxxxxx>
wrote:

> Dear GCC Developers,
>
> I am unsuccessfully using g++ 12.0.4 to compile lengthy c++ codes. Those
> codes are automatically generated from my own code-generator tools that
> depend on parameters p.
> Typical applications are:
> - Taylor series of order p inserted into consistency conditions of
> numerical schemes, to determine optimal method parameters (think of, e.g.,
> Runge-Kutta methods)
> - recursive automatic code transformation (think of adjoints of adjoints of
> adjoints...) of recursion level p
> - Hilbert curves or other space-filling curves to generate code that
> simulates cache utilization in a Monte-Carlo context
>
> I verify that for small p the codes compile and execute to the expected
> result. However, there is always a threshold for p beyond which the generated
> cpp file is so long that the compiler just terminates after ~10 min
> without any output on screen, returning exit status 1.
> My cpp files range from 600k LOC up to 1Bio LOC. Often, the file consists
> of one single c++ template class member function definition that relies on
> a few thousand lines of template-classes.
>
> I would like to know:
> 1) Am I doing something wrong in that GCC should be able to compile lengthy
> codes?
>

Do you have enough RAM?

> 2) Is it known that GCC is unable to compile lengthy codes?
>

Yes, there are loads of bug reports about this kind of thing, some fixed,
some not.

> 3) Is it acknowledged that a compiler's ability to compile large files is
> relevant?
>

Yes, within reason. Some generated code is just silly and could be written
differently.
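
As a rough illustration of "written differently": a generator could emit the
terms as data and evaluate them in a small loop, rather than emitting one
statement per term. The sketch below assumes the generated code is a long sum
of products of two inputs; the names Term, kTerms and evaluate are
hypothetical and do not come from the original generator. It may not apply
when the generated code depends on template types.

```cpp
#include <cstddef>

// Hypothetical layout of one generated term: coeff * x[i] * x[j].
struct Term {
    double coeff;
    std::size_t i;
    std::size_t j;
};

// What the generator would emit: a plain table instead of millions of
// statements. Three entries stand in for the real (much larger) table.
static const Term kTerms[] = {
    {0.5, 0, 1},
    {-2.0, 1, 1},
    {3.0, 0, 2},
};
static const std::size_t kNumTerms = sizeof(kTerms) / sizeof(kTerms[0]);

// Evaluates the sum over all terms of coeff * x[i] * x[j].
// The function body stays tiny no matter how many terms there are.
double evaluate(const Term* terms, std::size_t n, const double* x) {
    double sum = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        sum += terms[k].coeff * x[terms[k].i] * x[terms[k].j];
    }
    return sum;
}
```

For very large tables, the same data could be read from a file at run time,
so that the compiler never sees it at all.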

> 4) Are possible roots known for this inability and are these deliberate?
>

It depends on the code. Sometimes there are quadratic (or worse) algorithms
involved. Sometimes GCC's intermediate representation is just too memory
hungry.


> To give just one glimpse at the relevance: I analyzed methods for solving
> F(x)=0 by using k stages -DF(x)*vk= F(x+sum_i bki*vi).
> Compiling a Taylor series code with p=3, I was able to verify that three
> iterations of the Chord method converge with fourth order, thus surpassing
> Newton's method (same cost, but only quadratic order).
> Checking whether any four-stage method exists that converges with fifth order
> necessitates the compilation of a single cpp file function that is 1.2Mio
> lines of code long.


That's going to use insane amounts of memory just to represent the
function, before even trying to analyse it or optimise it.

> Since 2018 I have occasionally been trying to find a way to
> compile it.
> I also used this code to automatically generate Butcher-type tableaus for
> generalizations to initial-value problem solvers. One could really only
> dream of what scientific leaps become possible only if lengthy codes were
> compilable.
>
> Right now, whenever a situation of this kind occurs, I have to give up, cut
> my (possibly months of work) losses, and move on to working on something
> else -- because it seems there is nothing I can do about it.
>

Have you tried using clang instead?

> Even splitting a code of 1Bio LOC into chunks of 100k LOC and then
> automating to compile and link these in turn seems hopeless; and probably
> the linker has limitations as well..
>

Very different ones though.
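
A minimal sketch of the chunking idea, assuming the generated statements only
communicate through one shared state object. The names State, chunk_0 ...
chunk_2 and run_all are hypothetical. In a real setup each chunk function
would sit in its own generated .cpp file and be declared in a common header,
so that the compiler only ever has to hold one small function at a time; here
they share one file to keep the example short.

```cpp
#include <cstddef>

// Hypothetical shared state that all generated statements read and write.
struct State {
    double v[4];
};

// Each of these would be emitted into a separate translation unit
// (for example chunk_0.cpp, chunk_1.cpp, ...) and compiled on its own.
void chunk_0(State& s) { s.v[1] = s.v[0] * 2.0; }
void chunk_1(State& s) { s.v[2] = s.v[1] + 1.0; }
void chunk_2(State& s) { s.v[3] = s.v[2] * s.v[2]; }

// Small generated driver: a table of chunk functions, called in order.
using ChunkFn = void (*)(State&);
static const ChunkFn kChunks[] = {chunk_0, chunk_1, chunk_2};

void run_all(State& s) {
    for (ChunkFn f : kChunks) {
        f(s);
    }
}
```

The cost is that optimisations no longer see across chunk boundaries, which
is usually an acceptable trade when the alternative does not compile at all.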

> But since this problem is too frequent in my life, I would really like to
> find out how to fix it once and for all.
> This is a problem that I have been suffering from severely for years
> because it basically cripples how I can put my mathematical expertise to
> use.
>
> Thank you for getting in touch.
>
> Kind regards,
> Kai
>


