Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > I've often wondered why "make coccicheck" takes so long. This change > speeds it up by ~2x and makes it use much less memory. Or a reduction > of a max of around ~2GB per-process (under the old > SPATCH_BATCH_SIZE=0) to around ~200MB. > > Running the full "make coccicheck" now takes ~50 seconds with -j8 on > my machine, v.s. ~2x of that before. I've got 64GB of memory on that > machine, or it would be much slower. > > Why has it been so slow? Because I think we've always been running it > in entirely the wrong mode for what we wanted, and much of the > previous fixing of this target has involved re-arranging the deck > chairs on that particular Titanic. > > What we really want to do with coccicheck is to do search/replacements > in all our *.c and *.h files. This is now what we do, and we'll > process a default of 64 files at a time. > > What we were doing before was processing all our *.c files, and for > each of those *.c files we'd recursively look around for includes and > see if we needed to search/replace in those too. > > That we did that dates back to [1] when we were only processing *.c > files, and it was always very redundant. We'd e.g. visit the likes of > strbuf.h lots of times since it's widely used as an include. > > Then in the most recent attempt to optimize coccicheck in [2] this > anti-pattern finally turned into a bug. > > Namely: before this change, if your coccicheck rule applied to > e.g. making a change in strbuf.h itself we'd get *lots* of duplicate > hunks applying the exact same change, as concurrent spatch processes > invoked by xargs raced one another. In one instance I ended up with 27 > copies of the same hunk in a strbuf.patch. > > Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant > batch mitigated this. I suspect the author of [2] either mostly ran in > that mode, or didn't test on changes that impacted widely used header > files. > > So since we're going to want to process all our *.c and *.h let's just > do that, and drop --all-includes for --no-includes. It's not spatch's > job to find our sources, we're doing that. If someone is manually > tweaking COCCI_SOURCES they can just tweak SPATCH_FLAGS too. > > I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you > can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But > in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full > "make coccicheck". > > I'm also the whole "cat $@.log" introduced in [3]. Since we don't call > this in a loop anymore (and xargs will early-exit) we can just rely on > standard V=1 for debugging issues. > > 1. a9a884aea5 (coccicheck: use --all-includes by default, 2016-09-30) > 2. 960154b9c1 (coccicheck: optionally batch spatch invocations, > 2019-05-06) > 3. f5c2bc2b96 (Makefile: detect errors in running spatch, 2017-03-10) > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > --- Nice, so in short, we've been redundantly running the checker code over and over on the same header files wasting cycles. Even though I saw you mentioned something about preparing for a reroll, I'll tentatively queue this version to 'seen' for now. THanks.