On Tue, Nov 01, 2022 at 11:35:50PM +0100, Ævar Arnfjörð Bjarmason wrote: > Optimize the very slow "coccicheck" target to take advantage of > incremental rebuilding, and fix outstanding dependency problems with > the existing rule. > > The rule is now faster both on the initial run as we can make better > use of GNU make's parallelism than the old ad-hoc combination of > make's parallelism combined with $(SPATCH_BATCH_SIZE) and/or the > "--jobs" argument to "spatch(1)". > > It also makes us *much* faster when incrementally building, it's now > viable to "make coccicheck" as topic branches are merged down. > > The rule didn't use FORCE (or its equivalents) before, so a: > > make coccicheck > make coccicheck > > Would report nothing to do on the second iteration. But all of our > patch output depended on all $(COCCI_SOURCES) files, therefore e.g.: > > make -W grep.c coccicheck > > Would do a full re-run, i.e. a a change in a single file would force > us to do a full re-run. > > The reason for this (not the initial rationale, but my analysis) is: > > * Since we create a single "*.cocci.patch+" we don't know where to > pick up where we left off, or how to incrementally merge e.g. a > "grep.c" change with an existing *.cocci.patch. > > * We've been carrying forward the dependency on the *.c files since > 63f0a758a06 (add coccicheck make target, 2016-09-15) the rule was > initially added as a sort of poor man's dependency discovery. > > As we don't include other *.c files depending on other *.c files > has always been broken, as could be trivially demonstrated > e.g. with: > > make coccicheck > make -W strbuf.h coccicheck > > However, depending on the corresponding *.c files has been doing > something, namely that *if* an API change modified both *.c and *.h > files we'd catch the change to the *.h we care about via the *.c > being changed. > > For API changes that happened only via *.h files we'd do the wrong > thing before this change, but e.g. for function additions (not > "static inline" ones) catch the *.h change by proxy. > > Now we'll instead: > > * Create a <RULE>/<FILE> pair in the .build directory, E.g. for > swap.cocci and grep.c we'll create > .build/contrib/coccinelle/swap.cocci.patch/grep.c. Indeed, and the output of 'make coccicheck' with this patch looks like this: MKDIR -p .build/contrib/coccinelle/hashmap.cocci.patch SPATCH .build/contrib/coccinelle/hashmap.cocci.patch/upload-pack.c SPATCH .build/contrib/coccinelle/hashmap.cocci.patch/merge-ort-wrappers.c SPATCH .build/contrib/coccinelle/hashmap.cocci.patch/unpack-trees.c SPATCH .build/contrib/coccinelle/hashmap.cocci.patch/gpg-interface.c SPATCH .build/contrib/coccinelle/hashmap.cocci.patch/linear-assignment.c This is misleading, because it appears as 'spatch' would generate C source files, but that's not the case, of course. > That file is the diff we'll apply for that <RULE>-<FILE> > combination, if there's no changes to me made (the common case) > it'll be an empty file. > > * Our generated *.patch > file (e.g. contrib/coccinelle/swap.cocci.patch) is now a simple "cat > $^" of all of all of the <RULE>/<FILE> files for a given <RULE>. > > In the case discussed above of "grep.c" being changed we'll do the > full "cat" every time, so they resulting *.cocci.patch will always > be correct and up-to-date, even if it's "incrementally updated". > > See 1cc0425a27c (Makefile: have "make pot" not "reset --hard", > 2022-05-26) for another recent rule that used that technique. > > As before we'll: > > * End up generating a contrib/coccinelle/swap.cocci.patch, if we > "fail" by creating a non-empty patch we'll still exit with a zero > exit code. > > Arguably we should move to a more Makefile-native way of doing > this, i.e. fail early, and if we want all of the "failed" changes > we can use "make -k", but as the current > "ci/run-static-analysis.sh" expects us to behave this way let's > keep the existing behavior of exhaustively discovering all cocci > changes, and only failing if spatch itself errors out. > > Further implementation details & notes: > > * Before this change running "make coccicheck" would by default end > up pegging just one CPU at the very end for a while, usually as > we'd finish whichever *.cocci rule was the most expensive. > > This could be mitigated by combining "make -jN" with > SPATCH_BATCH_SIZE, see 960154b9c17 (coccicheck: optionally batch > spatch invocations, 2019-05-06). > > There will be cases where getting rid of "SPATCH_BATCH_SIZE" makes > things worse, but a from-scratch "make coccicheck" with the default > of SPATCH_BATCH_SIZE=1 (and tweaking it doesn't make a difference) > is faster (~3m36s v.s. ~3m56s) with this approach, as we can feed > the CPU more work in a less staggered way. > > * Getting rid of "SPATCH_BATCH_SIZE" particularly helps in cases > where the default of 1 yields parallelism under "make coccicheck", > but then running e.g.: > > make -W contrib/coccinelle/swap.cocci coccicheck > > I.e. before that would use only one CPU core, until the user > remembered to adjust "SPATCH_BATCH_SIZE" differently than the > setting that makes sense when doing a non-incremental run of "make > coccicheck". > > * Before the "make coccicheck" rule would have to clean > "contrib/coccinelle/*.cocci.patch*", since we'd create "*+" and > "*.log" files there. Now those are created in > .build/contrib/coccinelle/, which is covered by the "cocciclean" rule > already. > > Outstanding issues & future work: > > * We could get rid of "--all-includes" in favor of manually > specifying a list of includes to give to "spatch(1)". > > As noted upthread of [1] a naïve removal of "--all-includes" will > result in broken *.cocci patches, but if we know the exhaustive > list of includes via COMPUTE_HEADER_DEPENDENCIES we don't need to > re-scan for them, we could grab the headers to include from the > .depend.d/<file>.o.d and supply them with the "--include" option to > spatch(1).q > > 1. https://lore.kernel.org/git/87ft18tcog.fsf@xxxxxxxxxxxxxxxxxxx/ > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > --- > Makefile | 81 ++++++++++++++++++++++++----------- > contrib/coccinelle/.gitignore | 2 +- > shared.mak | 3 +- > 3 files changed, 58 insertions(+), 28 deletions(-) > > diff --git a/Makefile b/Makefile > index 25f0c7e837d..4bedff26285 100644 > --- a/Makefile > +++ b/Makefile > @@ -1306,10 +1306,6 @@ SANITIZE_ADDRESS = > SPATCH_INCLUDE_FLAGS = --all-includes > SPATCH_FLAGS = > SPATCH_TEST_FLAGS = > -# Setting SPATCH_BATCH_SIZE higher will > -# usually result in less CPU usage at the cost of higher peak memory. > -# Setting it to 0 will feed all files in a single spatch invocation. > -SPATCH_BATCH_SIZE = 1 > > # Rebuild 'coccicheck' if $(SPATCH), its flags etc. change > TRACK_SPATCH_DEFINES = > @@ -1317,7 +1313,6 @@ TRACK_SPATCH_DEFINES += $(SPATCH) > TRACK_SPATCH_DEFINES += $(SPATCH_INCLUDE_FLAGS) > TRACK_SPATCH_DEFINES += $(SPATCH_FLAGS) > TRACK_SPATCH_DEFINES += $(SPATCH_TEST_FLAGS) > -TRACK_SPATCH_DEFINES += $(SPATCH_BATCH_SIZE) > GIT-SPATCH-DEFINES: FORCE > @FLAGS='$(TRACK_SPATCH_DEFINES)'; \ > if test x"$$FLAGS" != x"`cat GIT-SPATCH-DEFINES 2>/dev/null`" ; then \ > @@ -3163,6 +3158,7 @@ check: $(GENERATED_H) > > COCCI_GLOB = $(wildcard contrib/coccinelle/*.cocci) > COCCI_RULES = $(COCCI_GLOB) > +COCCI_NAMES = $(COCCI_RULES:contrib/coccinelle/%.cocci=%) > > COCCICHECK_PENDING = $(filter %.pending.cocci,$(COCCI_RULES)) > COCCICHECK = $(filter-out $(COCCICHECK_PENDING),$(COCCI_RULES)) > @@ -3170,32 +3166,65 @@ COCCICHECK = $(filter-out $(COCCICHECK_PENDING),$(COCCI_RULES)) > COCCICHECK_PATCHES = $(COCCICHECK:%=%.patch) > COCCICHECK_PATCHES_PENDING = $(COCCICHECK_PENDING:%=%.patch) > > +# It's expensive to compute the many=many rules below, only eval them > +# on $(MAKECMDGOALS) that match these $(COCCI_RULES) > +COCCI_RULES_GLOB = > +COCCI_RULES_GLOB += cocci% > +COCCI_RULES_GLOB += .build/contrib/coccinelle/% > +COCCI_RULES_GLOB += $(COCCICHECK_PATCHES) > +COCCI_RULES_GLOB += $(COCCICHEC_PATCHES_PENDING) > +COCCI_GOALS = $(filter $(COCCI_RULES_GLOB),$(MAKECMDGOALS)) > + > COCCI_TEST_RES = $(wildcard contrib/coccinelle/tests/*.res) > > -COCCI_PATCHES = $(COCCI_RULES:%=%.patch) > -$(COCCI_PATCHES): GIT-SPATCH-DEFINES > -$(COCCI_PATCHES): $(COCCI_SOURCES) > -$(COCCI_PATCHES): %.patch: % > - $(QUIET_SPATCH) \ > - if test $(SPATCH_BATCH_SIZE) = 0; then \ > - limit=; \ > - else \ > - limit='-n $(SPATCH_BATCH_SIZE)'; \ > - fi; \ > - if ! echo $(COCCI_SOURCES) | xargs $$limit \ > - $(SPATCH) $(SPATCH_FLAGS) \ > - $(SPATCH_INCLUDE_FLAGS) \ > - --sp-file $< --patch . \ > - >$@+ 2>$@.log; \ > +.build/contrib/coccinelle/FOUND_H_SOURCES: $(FOUND_H_SOURCES) > + $(call mkdir_p_parent_template) > + $(QUIET_GEN) >$@ > + > +define cocci-rule > + > +## Rule for .build/$(1).patch/$(2); Params: > +# $(1) = e.g. "free.cocci" > +# $(2) = e.g. "grep.c" > +COCCI_$(1:contrib/coccinelle/%.cocci=%) += .build/$(1).patch/$(2) > +.build/$(1).patch/$(2): GIT-SPATCH-DEFINES > +.build/$(1).patch/$(2): .build/contrib/coccinelle/FOUND_H_SOURCES > +.build/$(1).patch/$(2): $(1) > +.build/$(1).patch/$(2): .build/$(1).patch/% : % > + $$(call mkdir_p_parent_template) > + $$(QUIET_SPATCH)if ! $$(SPATCH) $$(SPATCH_FLAGS) \ > + $$(SPATCH_INCLUDE_FLAGS) \ > + --sp-file $(1) --patch . $$< \ > + >$$@ 2>$$@.log; \ > then \ > - cat $@.log; \ > + echo "ERROR when applying '$(1)' to '$$<'; '$$@.log' follows:"; \ > + cat $$@.log; \ > exit 1; \ > - fi; \ > - mv $@+ $@; \ > - if test -s $@; \ > + fi > +endef > + > +define cocci-matrix > + > +$(foreach s,$(COCCI_SOURCES),$(call cocci-rule,$(1),$(s))) > +endef > + > +ifdef COCCI_GOALS > +$(eval $(foreach c,$(COCCI_RULES),$(call cocci-matrix,$(c)))) > +endif > + > +define spatch-rule > + > +contrib/coccinelle/$(1).cocci.patch: $$(COCCI_$(1)) > + $$(QUIET_SPATCH_CAT)cat $$^ >$$@ && \ > + if test -s $$@; \ > then \ > - echo ' ' SPATCH result: $@; \ > + echo ' ' SPATCH result: $$@; \ > fi > +endef > + > +ifdef COCCI_GOALS > +$(eval $(foreach n,$(COCCI_NAMES),$(call spatch-rule,$(n)))) > +endif > > COCCI_TEST_RES_GEN = $(addprefix .build/,$(COCCI_TEST_RES)) > $(COCCI_TEST_RES_GEN): GIT-SPATCH-DEFINES > @@ -3487,7 +3516,7 @@ profile-clean: > cocciclean: > $(RM) GIT-SPATCH-DEFINES > $(RM) -r .build/contrib/coccinelle > - $(RM) contrib/coccinelle/*.cocci.patch* > + $(RM) contrib/coccinelle/*.cocci.patch > > clean: profile-clean coverage-clean cocciclean > $(RM) -r .build > diff --git a/contrib/coccinelle/.gitignore b/contrib/coccinelle/.gitignore > index d3f29646dc3..1d45c0a40c8 100644 > --- a/contrib/coccinelle/.gitignore > +++ b/contrib/coccinelle/.gitignore > @@ -1 +1 @@ > -*.patch* > +*.patch > diff --git a/shared.mak b/shared.mak > index 96b06acc455..f437073e48c 100644 > --- a/shared.mak > +++ b/shared.mak > @@ -71,8 +71,9 @@ ifndef V > QUIET_RC = @echo ' ' RC $@; > > ## Used in "Makefile": SPATCH > - QUIET_SPATCH = @echo ' ' SPATCH $<; > + QUIET_SPATCH = @echo ' ' SPATCH $@; > QUIET_SPATCH_TEST = @echo ' ' SPATCH TEST $(@:.build/%=%); > + QUIET_SPATCH_CAT = @echo ' ' SPATCH CAT $$^ \>$@; > > ## Used in "Documentation/Makefile" > QUIET_ASCIIDOC = @echo ' ' ASCIIDOC $@; > -- > 2.38.0.1280.g8136eb6fab2 >