A v4 (I'm counting the 5/4 patch I sent to v2 as a v3) which produces the exact same end result as that v2 + 5/4 patch, but with a rewritten commit message/squash as requested by Junio. Ævar Arnfjörð Bjarmason (4): Makefile/coccicheck: add comment heading for all SPATCH flags Makefile/coccicheck: speed up and fix bug with duplicate hunks Makefile/coccicheck: allow for setting xargs concurrency Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 Makefile | 34 +++++++++++++++++++++++++--------- 1 file changed, 25 insertions(+), 9 deletions(-) Range-diff: 1: d3a2df76b04 = 1: d8c6efd2464 Makefile/coccicheck: add comment heading for all SPATCH flags 2: 6b0901e0fc5 ! 2: 3bca3239cb8 Makefile/coccicheck: speed up and fix bug with duplicate hunks @@ Commit message Makefile/coccicheck: speed up and fix bug with duplicate hunks Change the coccicheck target to run on all of our *.c and *.h files - with --no-includes, instead of only on the *.c files with - --all-includes. - - This speeds it up significantly and reduces its memory usage, since it - doesn't need to parse N includes for every file it visits. - - See [1] for a discussion thread about this commit with some timings - for details, but briefly: This change speeds it up by ~2x and makes it - use much less memory. Or a reduction of a max of around ~2GB - per-process (under the old SPATCH_BATCH_SIZE=0) to around ~200MB. - - Looking at the history of the coccicheck target I think we've always - been running it in the wrong mode for what we wanted to achieve: - - - When it was added in 63f0a758a06 (add coccicheck make target, - 2016-09-15) it explicitly processed only the %.c files. - - - Then in a9a884aea57 (coccicheck: use --all-includes by default, - 2016-09-30) it started processing the %.h files by looking around for - its own includes. - - Let's instead just point it to both our *.c and *.h files, then - there's no need to have it recursively look around for included files - to change. - - Setting --no-includes would not work if we expected to set - COCCI_SOURCES to a subset of our source files, but that's not what - we're doing here. If anyone manually tweaks COCCI_SOURCES they'll now - need to tweak SPATCH_FLAGS too. The speed and correctness we gain is - worth that small trade-off. - - Using --no-includes also fixes a subtle bug introduced in - 960154b9c17 (coccicheck: optionally batch spatch invocations, - 2019-05-06) with duplicate hunks being written to the - generated *.patch files. - - This is because that change altered a file-at-a-time for-loop to an - invocation of "xargs -n X". This would not matter for most other - programs, but it matters for spatch. - - This is because each spatch invocation will maintain shared lock files - in /tmp, check if files being parsed were changed etc. I haven't dug - into why exactly, but it's easy to reproduce the issue[2]. The issue - goes away entirely if we just use --no-includes, which as noted above - would have made sense even without that issue. - - 1. https://lore.kernel.org/git/20210302205103.12230-1-avarab@xxxxxxxxx/ - 2. A session showing racy spatch under xargs -n X: - - $ cat test.cocci - @@ - expression E1; - @@ - - strbuf_avail(E1) - + strbuf_has(E1) - $ for i in 1 2 4 16 64 128 512 - do - echo with xargs -n $i: && - echo *.c | xargs -n $i spatch --sp-file \ - test.cocci --all-includes --patch . 2>/dev/null | \ - grep -F +++ | sort | uniq -c - done - with xargs -n 1: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - with xargs -n 2: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - with xargs -n 4: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - with xargs -n 16: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - 2 +++ b/strbuf.h - with xargs -n 64: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - 2 +++ b/strbuf.h - with xargs -n 128: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - 2 +++ b/strbuf.h - with xargs -n 512: - 1 +++ b/convert.c - 1 +++ b/strbuf.c - 1 +++ b/strbuf.h + with --include-headers-for-types, instead of trusting it to find *.h + files and other includes to modify from its recursive walking of + includes as it has been doing with only --all-includes. + + The --all-includes option introduced in a9a884aea57 (coccicheck: use + --all-includes by default, 2016-09-30) is needed because we have rules + that e.g. use the "type T" construct that wouldn't match unless we + scoured our headers for the definition of the relevant type. + + But combining --all-includes it with processing N files at a time with + xargs as we've done since 960154b9c17 (coccicheck: optionally batch + spatch invocations, 2019-05-06) introduced a subtle bug with duplicate + hunks being written to the generated *.patch files. I did not dig down + to the root cause of that, but I think it's due to spatch doing (and + failing to do) some magical locking/racy mtime checking to decide if + it emits a hunk. See [1] for a way to reproduce the issue, and a + discussion of it. + + This change speeds up the runtime of "make -j8 coccicheck" from ~1m50s + to ~1m20s for me. See [2] for more timings. + + We could also use --no-includes for a runtime of ~55s, but that would + produce broken patches (we miss some hunks) in cases where we need the + type or other definitions from included files. + + If anyone cares there's probably an optimization opportunity here to + e.g. detect that the whole file doesn't need to consider includes, + since the rules would be unambiguous without considering them. + + Note on portability: The --include-headers-for-types option is not in + my "man spatch", but it's been part of spatch since 2014. See its + fe3a327a (include headers for types option, 2014-07-27), it was first + released with version 1.0.0 of spatch, released in April 2015. The + spatch developers are just inconsistent about updating their TeX docs + and manpages at the same time. + + 1. https://lore.kernel.org/git/20210305170724.23859-3-avarab@xxxxxxxxx/ + 2. https://lore.kernel.org/git/20210306192525.15197-1-avarab@xxxxxxxxx/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> @@ Makefile: SPARSE_FLAGS ?= # For the 'coccicheck' target -SPATCH_FLAGS = --all-includes --patch . -+SPATCH_FLAGS = --no-includes --patch . ++SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . + # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will # usually result in less CPU usage at the cost of higher peak memory. 3: c913d5dbe9b ! 3: 9d5814dacdc Makefile/coccicheck: allow for setting xargs concurrency @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> ## Makefile ## -@@ Makefile: SPATCH_FLAGS = --no-includes --patch . +@@ Makefile: SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . # Setting it to 0 will feed all files in a single spatch invocation. SPATCH_BATCH_SIZE = 1 4: 98225e65d30 ! 4: 51b782bb9b6 Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8 @@ Commit message Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> ## Makefile ## -@@ Makefile: SPATCH_FLAGS = --no-includes --patch . +@@ Makefile: SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will # usually result in less CPU usage at the cost of higher peak memory. # Setting it to 0 will feed all files in a single spatch invocation. 5: 60649abddbb < -: ----------- Makefile/coccicheck: use --include-headers-for-types -- 2.31.0.285.gb40d23e604f