Re: Parallelization of shell scripts for 'configure' etc.

On 6/14/22 16:36, Richard Purdie wrote:
> On Tue, 2022-06-14 at 13:11 -0400, Nick Bowler wrote:
>> The resulting config.h is correct but pa.sh took almost 1 minute to run
>> the configure script, about ten times longer than dash takes to run the
>> same script.  More than half of that time appears to be spent just
>> loading the program into pa.sh, before a single shell command is
>> actually executed.
> 
> Thanks for sharing that, it saves me looking into it!
> 
> I work on a cross compiling build environment (Yocto Project) and we
> find that a large percentage of our build times (20%?) are in the
> configure stage, either running autoreconf or configure with a 50/50
> split between the two. We autoreconf since we change the macros in some
> cases, e.g. libtool.
> 
> I would love to find a way to be more efficient about this part of our
> builds. We do already provide some cached values for some macros to try
> and be a little more efficient.
> 
> When I've profiled things, most of the time seems to be "fork" overhead
> of builds having to fork new processes to run shell command pipelines.
> I have sometimes wondered if we couldn't make code which was more
> optimised to the common case and didn't have so much forking going on.

I wonder if one could implement a shell that only created a new
process when it absolutely had to, and which implemented many of the
common text processing tools as builtin commands.  Subshells would
be implemented via user-level copy-on-write, rather than relying on
OS support for fork().

Another approach would be to generate Python or Perl scripts
in addition to shell scripts, allowing the use of the respective
interpreters when available.  In my experience that is basically all
the time.

Finally, a small but probably noticeable improvement would come
from dropping support for ancient platforms, such as Ultrix.  A much
bigger win would be to use Bash or Zsh if they are installed, as that
allows using modern shell tricks (such as [[ "$a" =~ [0-9]+ ]] and
"${a//a/b}") that do not require forking new processes.
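
As a rough sketch of the kind of saving meant here (the variable names
and sample values below are made up for illustration and are not taken
from Autoconf), compare the forking idioms in the comments with their
Bash equivalents, which are evaluated inside the shell process:

```bash
#!/usr/bin/env bash
# Illustrative values only; assumes Bash (Zsh stores regex matches
# differently, so the BASH_REMATCH part is Bash-specific).

version="gcc-12.1.0"
header="sys/types.h"

# Portable sh would typically fork expr or grep for a regex test, e.g.
#   expr "$version" : '.*[0-9]' >/dev/null
# In Bash, [[ =~ ]] is part of the shell, so no process is created.
first_number=""
if [[ "$version" =~ [0-9]+ ]]; then
  first_number=${BASH_REMATCH[0]}   # leftmost match: 12
fi

# Portable sh would typically fork a subshell plus sed, e.g.
#   shell_name=$(echo "$header" | sed 's/[^a-zA-Z0-9_]/_/g')
# ${var//pattern/replacement} does the same substitution in-process.
shell_name=${header//[!a-zA-Z0-9_]/_}   # sys_types_h

# printf is a Bash builtin, so this does not fork either.
printf '%s %s\n' "$first_number" "$shell_name"
```

Each construct replaces one or two fork/exec pairs, so any gain would
depend on how often a given configure script runs such tests.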
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
