Re: Parallelizing configure

* Marian Marinov wrote on Wed, Feb 09, 2011 at 01:38:16AM CET:
> On Tuesday 08 February 2011 22:51:20 Paul Eggert wrote:
> > Oh yes, I quite agree, it would require a real change to
> > the Autoconf implementation, and people who write tests
> > would have to be disciplined about their dependencies.
> > The default would be sequential, for backward compatibility,
> > but if someone goes to the work of declaring their dependencies,
> > we could assume that it could be slotted into a make -j
> > or whatever.

> We can start a new branch and see if it is worth the work. And if it
> is ok we can then start the upgrade of all the code.
> 
> I'm sure that there would be situations in which the tests must remain
> sequential. However we can isolate them into bigger independent
> sections of tests.

A preliminary step would be to identify macros that could be
parallelized easily and which would benefit from it: several
of the AC_PROG_* macros might be run in parallel, but some of
them are typically so fast that just forking a subshell for
them would be a waste of time.

The generic AC_{CHECK,PATH}_PROG can be parallelized if no
special arguments are passed (e.g., variables that denote
results from other such tests).

The compile and link machinery would need to run in thread-local
subdirectories or with thread-local file names at least, in order
to allow any of the compile or link tests to run in parallel.
(The latter is problematic if the compiler doesn't grok -c -o.)
This is probably one of the hardest things to get right, and very
prone to backward incompatibilities.  For example, CPPFLAGS may contain
relative -I paths, which stop working once a test runs in another
directory.
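
One conceivable way to cope with relative -I paths, sketched here only
as an illustration: rewrite them to absolute paths before a worker
changes into its private subdirectory.  The function name
absolutize_includes and the /build prefix are made up for this example,
and the sketch only handles the joined "-Idir" form, not "-I dir".

```shell
#!/bin/sh
# Hypothetical helper, not part of Autoconf.
# absolutize_includes TOP FLAG...
# Print FLAGs on one line, with every relative -Idir turned into
# -ITOP/dir.  Absolute -I paths and all other flags pass through.
absolutize_includes () {
  top=$1; shift
  out=
  for flag in "$@"; do
    case $flag in
      -I/*) ;;                          # already absolute
      -I?*) flag=-I$top/${flag#-I} ;;   # relative: anchor at TOP
    esac
    out="$out${out:+ }$flag"
  done
  printf '%s\n' "$out"
}

# A worker would compute this once, then cd into its own directory
# before compiling conftest.c there.
absolutize_includes /build -Iinclude -I/usr/include -DHAVE_X
```

Flags whose paths hide inside other options (-isystem, response files,
wrapper scripts) would still slip through, which is part of why this
area is so fragile.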

The special AC_FUNC_* could then mostly run in parallel, except of
course any macros pulled in via AC_REQUIRE would need to be
outside of the parallel part.

The generic AC_CHECK_FUNC* could run in parallel if none of the
arguments referenced results from earlier macros.  I don't think
it's worth going this way though, because AC_CHECK_FUNCS_ONCE
could be used instead: its checks can safely be run in parallel.

Generic header file checks are problematic, as already discussed.
AC_CHECK_HEADERS_ONCE can come to the rescue.  The special header
checks AC_HEADER_* could carry information on their semantics.

Variable and function declaration checks are problematic, except
maybe for AC_CHECK_DECLS_ONCE.

Type checks could be a good source of parallelism, if the macro
arguments are simple.  _ONCE macros could be introduced.


All macros that can run in parallel would need to somehow denote
this.  I'm not sure if the AC_REQUIRE notation can be reused as
"this needs to run before that".  But anyway it is not a good idea
if threads themselves spawn other threads; only the master should
do that.

An efficient(!) way to transport results back needs to be implemented.
Probably results could be written in a shell script snippet to be
sourced by the master after the test has finished.  This means the
driver would need to be told which shell variables each test sets as
its result.  Errors during a test would need to be transported back,
including status and messages.  Special ACTION-IF-{TRUE,FALSE}
arguments, AC_DEFINEs and the like would have to be executed in the
master, not the thread.
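
A minimal sketch of such a result transport, under the assumption that
each worker is a background subshell.  The names run_check, ac_cv_*,
ac_status_* and the file layout are invented for illustration and are
not existing Autoconf machinery; mktemp -d is widely available but not
POSIX.

```shell
#!/bin/sh
# Sketch only: all names here are hypothetical.
tmpdir=$(mktemp -d) || exit 1

# run_check NAME COMMAND...
# Run COMMAND in a background subshell.  Write NAME's result and exit
# status as shell assignments to a per-worker file; keep messages in a log.
run_check () {
  name=$1; shift
  (
    if "$@" >"$tmpdir/$name.log" 2>&1; then status=0; else status=$?; fi
    if test "$status" -eq 0; then result=yes; else result=no; fi
    {
      echo "ac_cv_$name=$result"
      echo "ac_status_$name=$status"
    } >"$tmpdir/$name.result"
  ) &
}

# Two stand-in tests: one succeeds, one fails with a message.
run_check works sh -c 'exit 0'
run_check broken sh -c 'echo "broken: no such tool" >&2; exit 3'

wait   # the master blocks until every worker has finished

# The master sources each snippet and replays the captured messages.
for f in "$tmpdir"/*.result; do
  . "$f"
done
cat "$tmpdir"/*.log >&2
rm -rf "$tmpdir"

echo "works=$ac_cv_works broken=$ac_cv_broken (status $ac_status_broken)"
```

Anything like an AC_DEFINE would then be issued by the master after
sourcing, based on the variables it now sees.  Note that every check
costs at least one fork plus two small files, so this only pays off for
tests that are slow to begin with.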

It is not easy to get this correct while also still actually providing
much speedup at all.  The forking overhead can be significant (you could
measure Autotest for data).

As a first idea, there could be an
  AC_RUNS_PARALLEL([OUTPUT-VARIABLE...])

annotation specifying that the macro in which it is called has been
written to run in parallel.  Then,
  AC_PARALLEL([MACRO1],
              [MACRO2([ARGS...])],
              ...)

would run a block in parallel if *all* macros are AC_RUNS_PARALLEL ones,
otherwise in serial.
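
The all-or-nothing rule could look roughly like this if the decision
were made at configure run time in the shell, which is an assumption of
this sketch and not necessarily where it belongs.  The registry
variable, run_block and the check_* functions are hypothetical
stand-ins.

```shell
#!/bin/sh
# Sketch only: all names here are hypothetical.
# Checks that have been declared safe to run in parallel (the role
# AC_RUNS_PARALLEL would play), as a space-delimited list.
parallel_safe=" check_a check_b "

check_a () { :; }
check_b () { :; }
check_c () { :; }   # never declared parallel-safe

# run_block CHECK...
# Run all CHECKs concurrently only if every one is registered; otherwise
# fall back to running them one after another.  Sets block_mode.
run_block () {
  block_mode=parallel
  for c in "$@"; do
    case $parallel_safe in
      *" $c "*) ;;
      *) block_mode=serial ;;
    esac
  done
  if test "$block_mode" = parallel; then
    for c in "$@"; do "$c" & done
    wait
  else
    for c in "$@"; do "$c"; done
  fi
}

run_block check_a check_b; echo "block 1: $block_mode"
run_block check_a check_c; echo "block 2: $block_mode"
```

In the parallel branch the checks run in subshells, so their results
would still have to be carried back to the master through files.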

Still, this doesn't achieve parallelism between bigger blocks.
There could be an
  AC_PARALLEL_UNSAFE([BLOCK1],
                     [BLOCK2],...)

that runs larger blocks in parallel, in the hope that all relevant
results from all macros inside each block are appended to the per-thread
results file, and that the blocks truly do not have interdependencies.

One upside of such an approach is that it can be done incrementally (it
would be straightforward to produce a non-parallel configure script from
it as long as the machinery isn't stable yet).

But I have no clue if or how the parallel-or-not decision can be
implemented at M4 time, nor time to work on it in the near future.

End of brain dump.  Thanks for reading this far.  Kudos go to Bruno for
some discussions about this last year.

Cheers,
Ralf

_______________________________________________
Autoconf mailing list
Autoconf@xxxxxxx
http://lists.gnu.org/mailman/listinfo/autoconf