On Thu, Jun 24 2021, Jeff King wrote:

> On Thu, Jun 24, 2021 at 03:16:48PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> This is probably all stuff that's been on list before / known by
>> some/all people in the CC list, but in case not: I looked a bit into why
>> we're so frequently re-linking and re-compiling things these days,
>> slowing down e.g. "git rebase --exec='make ...'".
>>
>> These are all fixable issues, I haven't worked on them, just some notes
>> in case anyone has better ideas:
>
> From a quick skim I didn't see anything wrong in your analysis or
> suggestions. I do kind of wonder if we are hitting a point of
> diminishing returns here. "make -j16" on my system takes ~175ms for a
> noop, and ~650ms if I have to regenerate version.h (it's something like
> 2s total of CPU, but I have 8 cores).
>
> I know I've probably got a nicer machine than many other folks. But at
> some point correctness and avoiding complexity in the Makefile become a
> win over shaving off a second from compile times. You'd probably find
> lower hanging fruit in the test suite which could shave off tens of
> seconds.

:) It's mainly annoying when e.g. doing a rebase of an N patch series,
those ~700ms vs. ~200ms add up quickly.

>> * {command,config}-list.h (and in-flight, my hook-list.h): Every time
>>   you touch a Documentation/git-*.txt we need to re-generate these, and
>>   since their mtime changes we re-compile and re-link all the way up to
>>   libgit and our other tools.
>>
>>   I think the best solution here is to make the generate-*.sh
>>   shellscripts faster (just one takes ~300ms of nested shellscripting,
>>   just to grep out the first few lines of every git-*.txt, in e.g. Perl
>>   or a smarter awk script this would be <5ms).
>
> Yeah, I think Eric mentioned he had looked into doing this in perl, but
> we weren't entirely happy with the dependency.
> Here's another really odd thing I noticed:
>
>   $ time sh ./generate-cmdlist.sh command-list.txt >one
>   real    0m1.323s
>   user    0m1.531s
>   sys     0m0.834s
>
>   $ time sh -x ./generate-cmdlist.sh command-list.txt >two
>   [a bunch of trace output]
>   real    0m0.513s
>   user    0m0.754s
>   sys     0m0.168s
>
>   $ cmp one two
>   [no output]
>
> Er, what? Running with "-x" makes it almost 3 times faster to generate
> the same output? I'd have said this is an anomaly, but it's repeatable
> (and swapping the order produces the same output, so it's not some weird
> priming thing). And then to top it all off, redirecting the trace is
> slow again:
>
>   $ time sh -x ./generate-cmdlist.sh command-list.txt >two 2>/dev/null
>   real    0m1.363s
>   user    0m1.538s
>   sys     0m0.902s
>
> A little mini-mystery that I think I may leave unsolved for now.

Sounds interesting if true, I haven't looked into it.

>> Then we make those FORCE, but most of the time the config or command
>> summary (or list of hooks) doesn't change, so we don't need to
>> replace the file.
>
> Yes, possibly we could use the "if it hasn't changed, don't update the
> file" trick to avoid cascading updates.

The problem is also that you can only do it at the lowest level, or
you'll get into a dead-end of something else depending on the FORCE
target continually re-making it, even though the target itself decided
there was nothing to do based on a cmp(1).

>> Perhaps even better would be to piggy-back on the RUNTIME_PREFIX
>> support, and simply drop in generated plain-text files, so in your build
>> checkout the list of hooks, commands etc. would be parsed instead of
>> compiled in. Then we wouldn't need to re-build or re-link anything for
>> the version or this other data.
>
> Yeah, that would work. I worry a bit that the value of something like
> "version.h" is lost with a runtime file, though.
> The point is to bake it into the binary so you can't accidentally get
> the wrong value (say, from running "./git" from the build directory,
> which looks at the runtime file where the binary _would_ be installed,
> except you haven't run "make install" yet).

I think all of those concerns are covered under RUNTIME_PREFIX, it
discovers files relative to git whether you have it installed or not.

I still haven't looked into why I sometimes need --exec-path=$PWD in the
build checkout, and sometimes not though...
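
For reference, the "if it hasn't changed, don't update the file" trick
mentioned above could look roughly like the following. This is only a
sketch under assumptions: write_if_changed is a made-up helper name, not
something that exists in the Makefile or the generate-*.sh scripts, and
the temporary file naming is arbitrary. It generates into a temporary
file and only replaces the target when cmp(1) says the content differs,
so an unchanged target keeps its old mtime.

```shell
#!/bin/sh
# Hypothetical helper (not part of git.git): copy stdin to "$1", but
# leave "$1" and its mtime untouched if the new content is identical.
write_if_changed () {
	target=$1
	tmp="$target.tmp.$$"

	# Capture the freshly generated content first.
	cat >"$tmp" || { rm -f "$tmp"; return 1; }

	if test -f "$target" && cmp -s "$tmp" "$target"
	then
		# Same bytes: discard the new copy, keep the old mtime.
		rm -f "$tmp"
	else
		# New or changed: replace the target.
		mv "$tmp" "$target"
	fi
}
```

Usage would be something like
"sh ./generate-cmdlist.sh command-list.txt | write_if_changed command-list.h".
As noted above, this only helps if it is applied at the lowest level of
the dependency chain; anything that still depends directly on a FORCE
target will keep being re-made regardless.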