On Thu, Oct 21, 2021 at 02:48:24AM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> Per Eric Sunshine's upthread comments an awk and Perl implementation
> >> were both considered before[1].
> >
> > Ah sorry, I thought it was just a perl one that had been the
> > show-stopper. I hadn't noticed the awk one. However, the point of my
> > patch was to use perl if available, and fall back otherwise. Maybe
> > that's too ugly, but it does address the concern with Eric's
> > implementation.
>
> I think carrying two implementations is worse than just having the one
> slightly slower one.

I have no opinion on whether it is reasonable to assume that awk or Perl
exists and can be relied upon during the build. It seems like the former
might be a slightly safer assumption than the latter, but in all honesty
it seems like both are always likely to be around.

In any case, I think the point was that we could improve upon Peff's
patch by just having a single implementation done in awk. And when I
wrote that I definitely was in the mindset of being able to rely on awk
during compilation.

> >> I.e. I think if you e.g. touch Documentation/git-a*.txt with this series
> >> with/without this awk version the difference in runtime is within the
> >> error bars. I.e. making the loop faster isn't necessary. It's better to
> >> get to a point where make can save you from doing all/most of the work
> >> by checking modification times, rather than making an O(n) loop faster.
> >
> > FWIW, I don't agree with this paragraph at all. Parallelizing or reusing
> > partial results is IMHO inferior to just making things faster.
>
> I agree with you in the general case, but for something that's consumed
> by a make dependency graph I find it easier to debug things if
> e.g. changing git-add.txt results in a change to git-add.gen, which is
> then cat'd together.
>
> IOW if we had a sufficiently fast C compiler I think I'd still prefer
> make's existing rules over some equivalent of:
>
>     cat *.c | super-fast-cc
>
> Since similar to how the *.sp files depend on the *.o files now,
> declaring the dependency graph allows you to easily add more built
> things.

This seems like an unfair comparison to me. I might be more sympathetic
if we were generating a more complicated artifact by running
generate-cmdlist.sh, but its inputs and outputs seem very well defined
(and uncomplicated) to me.

In any case, I agree with Peff that this isn't the approach that I would
have taken. But I also think that *just* parallelizing isn't necessarily
a win here. There are two reasons I think that:

  - The cognitive load required to parallelize this process is high;
    the .build directory seems like another thing to keep track of, and
    it's not clear to me what updates it, or what the result of touching
    some file in that directory is.

  - But even if the parallelization were achievable by more
    straightforward means, you still have to do the slow thing when
    you're rebuilding from scratch. So this is strictly worse the first
    time you are compiling, at least on machines with fewer cores.

In any case, this is all overkill in my mind for what we are talking
about. I agree that 'cat *.c | super-fast-cc' is worse than a competent
Makefile that knows what to build and when. But the problem here is a
slow loop in shell that is easily made much faster by implementing it in
a language that can execute the whole loop in a single process.

Thanks,
Taylor
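
For illustration, the difference between a per-file shell loop and a
single-process implementation might look like the sketch below. This is
not the code from generate-cmdlist.sh or from any patch in this thread;
the function names synopsis_loop and synopsis_awk are hypothetical, and
the sketch assumes each input file contains a line of the form
"git-<cmd> - <description>".

```shell
#!/bin/sh

# Slow variant: forks sed and head once per input file, so the
# process-creation cost grows with the number of files.
synopsis_loop () {
	for f in "$@"
	do
		sed -n 's/^git-[a-z-]* - //p' "$f" | head -n 1
	done
}

# Fast variant: a single awk process reads every file in turn.
# FNR resets to 1 at the start of each file, which lets us reset
# the per-file "seen" flag and print only the first match per file.
synopsis_awk () {
	awk '
		FNR == 1 { seen = 0 }
		!seen && /^git-[a-z-]* - / {
			sub(/^git-[a-z-]* - /, "")
			print
			seen = 1
		}
	' "$@"
}
```

Both functions should print the same descriptions when invoked as, say,
`synopsis_awk Documentation/git-a*.txt`; only the number of processes
spawned differs.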