Toolchain bootstrapping advice needed

Greetings GCC users and developers!

I've recently embarked upon a possibly futile effort to create a script to bootstrap a GNU toolchain (binutils, gcc, and glibc) from a system with the most minimal of prerequisites. My goal is a script that can, on any build system, create a toolchain that runs on a specific host type (which may or may not be the same as the build system type) and that targets a possibly different target type. In other words, I'm trying to write a single script that can build a toolchain for any arbitrary combination of build, host, and target system types.

I've gotten pretty far, although it has taken quite a long time to understand the intricate dance that must be performed to bootstrap gcc and glibc. It has also required some patches to glibc, mostly to get past what I consider to be a deficiency in its autoconf scripts: they error out on tests that check for a working linker, when in fact glibc ought to be buildable without any linker at all (its utility programs can't be built that way, but those aren't necessary during the bootstrapping process). At this point I can produce a working binutils and glibc, but the "final" build of gcc is giving me some problems that I am still working through.

But what I really want to talk about is my general approach: I'd like some validation of it and of my assumptions about its value (or lack thereof).

Basically, I want my build script to not assume that the system compiler is anything other than an ISO C90 compiler, with a standard C library that may not have anything to do with glibc but is a complete implementation of the C library standard.

What this forces me to do is to not simply compile all the tools with the system compiler directly; the only tools I can build with the system compiler are binutils and gcc. glibc itself has a strict requirement that it be compiled with gcc and I don't even want to assume that any old version of gcc on the system is sufficient; I'd rather let the version of gcc being bootstrapped be the one to compile glibc. In this way it's up to the toolchain builder to choose versions of binutils, gcc, and glibc that are known to work together, not to require that his or her build system has a binutils, gcc, or glibc that is compatible with whatever target versions are being built. Like I said, I just want to assume an ISO C90 compiler and C library, and nothing more.

The set of steps that I have come up with to accomplish this bootstrapping is:

1. Build binutils
2. Build stage1 gcc, building just the "gcc" and "install-gcc" targets, not the full build (which would try to compile libraries that require glibc, which has not yet been built)
3. Build stage1 glibc using the stage1 gcc compiler; this uses the binutils from (1) and the stage1 gcc from (2). This version of glibc is built with only static libraries and without any of the helper programs of glibc, because the stage1 gcc cannot build shared libraries or executables.
4. Build stage2 gcc against the stage1 glibc, with executable and shared library support, but without libmudflap which cannot be built against the purely static stage1 glibc.
5. Build final glibc with stage2 gcc, this is a complete and final glibc with shared library support and support of all features.
6. Build final gcc against final glibc, which is a complete gcc with full support for all features.
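To make the ordering constraints in the six steps above explicit, here is a minimal sketch in Python that records each step together with the compiler that builds it, and checks that no step uses a compiler that has not been produced yet. The `Step` class and the `bootstrap_steps` and `check_order` helpers are hypothetical names invented for this illustration; they are not part of the actual script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    package: str     # "binutils", "gcc", or "glibc"
    stage: str       # "stage1", "stage2", "final", or "" for binutils
    built_with: str  # the compiler that compiles this step
    notes: str


def bootstrap_steps():
    """Return the six steps in the order listed in the post."""
    return [
        Step(1, "binutils", "", "system compiler", "assembler and linker"),
        Step(2, "gcc", "stage1", "system compiler", "compiler only, no target libraries"),
        Step(3, "glibc", "stage1", "stage1 gcc", "static libraries only, no helper programs"),
        Step(4, "gcc", "stage2", "system compiler", "shared library support, no libmudflap"),
        Step(5, "glibc", "final", "stage2 gcc", "complete glibc with shared libraries"),
        Step(6, "gcc", "final", "system compiler", "complete gcc with all features"),
    ]


def check_order(steps):
    """Raise ValueError if a step needs a compiler that is not built yet."""
    available = {"system compiler"}
    for step in steps:
        if step.built_with not in available:
            raise ValueError(
                "step %d needs %s, which is not available yet"
                % (step.number, step.built_with)
            )
        if step.package == "gcc":
            # Each gcc build makes a new compiler available to later steps.
            available.add("%s gcc" % step.stage)
    return True


if __name__ == "__main__":
    steps = bootstrap_steps()
    check_order(steps)
    for step in steps:
        label = ("%s %s" % (step.stage, step.package)).strip()
        print("%d. %-14s built with %-15s (%s)"
              % (step.number, label, step.built_with, step.notes))
```

The sketch only models dependencies between compilers. It says nothing about configure options or sysroots, which are where the real difficulty lies.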

(My remaining difficulty is with step 6: the stage2 gcc uses a sysroot that prevents it from linking properly against the final glibc, but I'll work that out.)

These steps are complicated by gcc's library dependencies (zlib, gmp, mpfr, mpc), which must be built for the build system, the host system, and the target system at various points during the process. They are also complicated by the need to build multiple versions of binutils, because of the binutils "feature" of requiring the sysroot to be a compile-time option instead of a runtime option.

What the above sequence produces is a cross-compiler built to run on the build system targeting a given target system, which is not the end goal of the bootstrapping process, but does produce cross-compilers that are needed to complete the process.

That sequence is run twice: once to produce a cross-compiler that runs on the build system and targets the host system, and once to produce a cross-compiler that runs on the build system and targets the target system (if host=target, then only one build is necessary).

Finally, once a cross-compiler for both the host and target system is available, a final binutils version to run on the host system and target the target system is built, along with a gcc for the host system targeting the target system.
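The overall plan described in the last three paragraphs can be sketched as a small Python function that, given the build, host, and target types, lists the toolchains to produce as (runs on, targets) pairs. The function name `plan_toolchains` and the example triplets are assumptions made for illustration only. The sketch does not special-case build = host, since the post does not say how that case is handled.

```python
def plan_toolchains(build, host, target):
    """Return the toolchains to build, in order, as (runs_on, targets) pairs."""
    plan = []
    # One cross toolchain running on the build system for the host type and
    # one for the target type; if host == target, a single one is enough.
    for system in (host, target):
        pair = (build, system)
        if pair not in plan:
            plan.append(pair)
    # Final binutils and gcc: they run on the host and target the target.
    plan.append((host, target))
    return plan


if __name__ == "__main__":
    # Example triplets, chosen arbitrarily for this illustration.
    for runs_on, targets in plan_toolchains(
        "x86_64-pc-linux-gnu", "arm-linux-gnueabi", "mips-linux-gnu"
    ):
        print("runs on %s, targets %s" % (runs_on, targets))
```

With three distinct system types this yields three toolchains; with host equal to target it yields two.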

These steps result in quite a few compiles:

- binutils is built 9 times
- gcc is built 6 times
- glibc is built 5 times

But I believe that this process is successful in not depending on the version of the build system compiler at all; the compiler simply needs to be ISO C90 compliant so that it can build gcc and binutils (as I mentioned, the bootstrapped gcc and binutils are themselves used to build glibc). At each step of the bootstrapping process, each tool depends only on the other tools being built, except of course for the build system's ISO C90 compiler and C library.

One shortcoming of my approach is that the final version of glibc is not built by a gcc that was built by itself; it is instead built by a gcc that was built by the build system compiler. Does this matter? If so I think the easiest thing for me to do would be to adapt my script to first build binutils, gcc, and glibc with build=host=target, and then use that as the "build system toolchain" for the other steps I outlined above. Then the versions of the compiler and binutils that will be used to produce the final versions of glibc will have been built by the target toolchain itself instead of by the system toolchain. This will add 4 more binutils builds, 3 more gcc builds, and 2 more glibc builds to the mix, but it will hopefully produce even more robust output.
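For reference, the total build counts with that extra build=host=target pass work out as follows. This is plain arithmetic on the numbers quoted in this post; the dictionary names are just illustrative.

```python
# Build counts for the process as described so far.
current = {"binutils": 9, "gcc": 6, "glibc": 5}
# Additional builds for the extra build=host=target pass.
extra = {"binutils": 4, "gcc": 3, "glibc": 2}

total = {package: current[package] + extra[package] for package in current}

for package, count in total.items():
    print("%s: %d builds" % (package, count))
print("all packages: %d builds" % sum(total.values()))
```

That is 13 binutils builds, 9 gcc builds, and 7 glibc builds, or 29 builds in all compared with 20 now.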

I think that some of my steps could be simplified if I could convince myself that I don't need to use sysroots during various stages of the bootstrapping, and can just reference the build system includes and libraries instead of trying to always be sure that every step references only the toolchain being built. Is it a worthwhile goal to try to make every build step rely only on the toolchain being built instead of the build system toolchain?

Finally, can someone validate my assumptions here:

1. When gcc is built, it should be built with a --with-build-sysroot that references the version of glibc being built rather than the build system's libc.

2. When glibc is built, it is OK for it to reference the build system's libc header files rather than its own. I haven't figured out how to configure glibc's build to reference only its own headers instead of the system libc headers (I try to avoid CFLAGS because it wreaks havoc with configure).

3. --with-build-sysroot is a sufficient option to cause gcc builds to reference only the glibc headers and libs produced during the bootstrapping process.
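To make assumptions 1 and 3 concrete, here is a rough Python sketch that assembles a gcc configure command line using both sysroot options. GCC's installation documentation describes --with-build-sysroot as meaningful only in combination with --with-sysroot, so the sketch passes both. The helper name `gcc_configure_args`, the configure path, and all directory names are hypothetical. Whether these options alone keep every build step away from the build system's headers and libraries is exactly the open question in item 3.

```python
import shlex


def gcc_configure_args(target, prefix, sysroot, build_sysroot):
    """Assemble a gcc configure command line (illustrative only)."""
    return [
        "../gcc/configure",            # hypothetical path to the gcc sources
        "--target=" + target,
        "--prefix=" + prefix,
        # Directory the installed compiler treats as the system root.
        "--with-sysroot=" + sysroot,
        # Directory used in its place while gcc's target libraries are built.
        "--with-build-sysroot=" + build_sysroot,
    ]


if __name__ == "__main__":
    args = gcc_configure_args(
        target="arm-linux-gnueabi",                 # example triplet
        prefix="/opt/toolchain",                    # hypothetical paths
        sysroot="/opt/toolchain/sysroot",
        build_sysroot="/tmp/bootstrap/sysroot",
    )
    print(" ".join(shlex.quote(arg) for arg in args))
```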

Sorry for the disjointed and wordy nature of this post; I'm really tired after many long hours of hacking on this script.

Thanks!
Bryan


