Re: Why the Makefile is so eager to re-build & re-link

On Fri, Jun 25, 2021 at 10:34:20AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Interesting, but I think rather than micro-optimizing the O(n) loop it
> makes more sense to turn it into a series of O(1) in -j parallel,
> i.e. actually use the make dependency graph for this as I suggested in:
> https://lore.kernel.org/git/87wnqiyejg.fsf@xxxxxxxxxxxxxxxxxxx/

I have mixed feelings on that. I do like the general notion of breaking
apart tasks and feeding the dependencies to "make", because that lets it
do a better job of parallelizing or avoiding already-done work. But
there's a cost to running any job, so eventually you get to a unit of
work that's so small the overhead dominates.

For instance, starting from a built Git but dirtying all doc files with
"touch Documentation/*.txt", running "time make -j16" yields:

  real	0m1.749s
  user	0m2.963s
  sys	0m1.146s

With your patch to break it apart into many jobs, the same operation
gives:

  real	0m0.762s
  user	0m3.054s
  sys	0m0.600s

So that took fewer wall-clock seconds, but we actually spent more user CPU.
On a system with fewer cores, it would probably be a loss in both
places.

Now maybe that's a good tradeoff, especially because the common case
(aside from a build-from-scratch, which will spend loads more time
actually compiling anyway) is that only a handful of files would be
updated.

But if we can just make the actual operation faster, then even O(n)
repeated work might be a win in all cases, because it's avoiding the
overhead of extra jobs.

I dunno. I think there's a formula here that depends on the overhead of
a job versus the time to handle a single file in the script, coupled
with the expected number of changed files for any run. I'm not sure of
the exact values of those numbers in this case, but am mostly pointing
out that it's a tradeoff and not always a pure win. :)
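
To make that tradeoff a bit more concrete, here is a rough sketch of the
kind of formula I mean. It only models CPU cost, not wall-clock time, and
the function names and every number fed to it are made-up placeholders,
not measurements. It assumes the single-script approach pays one job's
overhead and rescans every file, while the split approach pays one job
per changed file plus one more job to assemble the result:

```shell
#!/bin/sh
# Rough CPU-cost model. All inputs are hypothetical placeholders,
# not measurements. Units are microseconds.

# One job that rescans every file whenever anything changed.
cost_single() { # total_files per_file_us job_us
	echo $(( $3 + $1 * $2 ))
}

# One job per changed file, plus one job to assemble the result.
cost_split() { # changed_files per_file_us job_us
	echo $(( ($1 + 1) * $3 + $1 * $2 ))
}

# Print which approach spends less CPU under this model.
cheaper() { # total_files changed_files per_file_us job_us
	single=$(cost_single "$1" "$3" "$4")
	split=$(cost_split "$2" "$3" "$4")
	if [ "$split" -lt "$single" ]; then
		echo split
	else
		echo single
	fi
}
```

With those assumptions, "cheaper 150 3 100 2000" favors the split (a
handful of changed files), while "cheaper 150 150 100 2000" favors the
single script (everything changed), which is the shape of the tradeoff
described above.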

> Something like the hacky throwaway patch that follows. Now when you
> touch a file in Documentation/git-*.txt you re-make just that file
> chain, which gets assembled into the command-list.h:

I know you said this was throwaway, but in case you do pursue it
further, my first run hit:

  $ time make
  GIT_VERSION = 2.32.0.94.gaa5e6f14dd
      * new prefix flags
      GEN build/Documentation
      GEN build/Documentation/git-add.txt.cmdlist.in
  /bin/sh: 1: cannot create build/Documentation/git-add.txt.cmdlist.in: Directory nonexistent
  /bin/sh: 5: cannot create build/Documentation/git-add.txt.cmdlist.in: Directory nonexistent
      GEN build/Documentation/git-am.txt.cmdlist.in
  /bin/sh: 1: cannot create build/Documentation/git-am.txt.cmdlist.in: Directory nonexistent
  /bin/sh: 5: cannot create build/Documentation/git-am.txt.cmdlist.in: Directory nonexistent

So I'd guess there's some race with creating the build/Documentation
directory (a subsequent run worked fine).
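
If that guess about a race is right, one cheap way around it is to have
each per-file job create its own output directory before writing, since
"mkdir -p" does not fail when the directory already exists. The sketch
below only illustrates that idea: gen_cmdlist_in is a hypothetical
helper, not something from the patch, and its body is a stand-in for
whatever the real recipe extracts. (In make terms, the other usual fix
would be an order-only prerequisite on the directory.)

```shell
#!/bin/sh
# Sketch only: gen_cmdlist_in is a hypothetical helper, not code from
# the patch under discussion.
gen_cmdlist_in() { # source_txt output_path
	# Create the parent directory here, so the job does not depend on
	# a separate "create directory" step having finished first.
	mkdir -p "$(dirname "$2")" || return 1
	# Placeholder body: the real recipe would extract the
	# command-list data from the source file here.
	head -n 1 "$1" >"$2"
}
```

Called as, e.g., "gen_cmdlist_in Documentation/git-add.txt
build/Documentation/git-add.txt.cmdlist.in", it works whether or not
build/Documentation exists yet.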

-Peff
