On Thu, Aug 02 2018, Jeff King wrote: > On Thu, Aug 02, 2018 at 01:55:22PM +0200, SZEDER Gábor wrote: > >> Let's add a bit of Makefile metaprogramming to generate finer-grained >> make targets applying one semantic patch to only a single source file, >> and specify these as dependencies of the targets applying one semantic >> patch to all source files. This way that shell loop can be avoided, >> semantic patches will only be applied to changed source files, and the >> same semantic patch can be applied in parallel to multiple source >> files. The only remaining sequential part is aggregating the >> suggested transformations from the individual targets into a single >> patch file, which is comparatively cheap (especially since ideally >> there aren't any suggestions). >> >> This change brings spectacular speedup in the scenario described in >> point (1) above. When the results of a previous 'make coccicheck' are >> there, the time needed to run >> >> touch apply.c ; time make -j4 coccicheck >> >> went from 3m42s to 1.848s, which is just over 99% speedup, yay!, and >> 'apply.c' is the second longest source file in our codebase. In the >> scenario in point (2), running >> >> touch contrib/coccinelle/array.cocci ; time make -j4 coccicheck >> >> went from 56.364s to 35.883s, which is ~36% speedup. > > I really like this direction. The slowness of coccicheck is one of the > things that has prevented me from running it more frequently. And I'm a > big fan of breaking steps down as much as possible into make targets. It > lets make do its job (avoiding repeated work and parallelizing). Yeah, this is great. Also, CC-ing some of the recent contributors to linux.git's coccinelle, perhaps they're interested / have comments. >> All the above timings are best-of-five on a machine with 2 physical >> and 2 logical cores. I don't have the hardware to bring any numbers >> for point (3). The time needed to run 'make -j4 coccicheck' in a >> clean state didn't change, it's ~3m42s both with and without this >> patch. > > On a 40-core (20+20) machine, doing "make -j40 coccicheck" from scratch > went from: > > real 1m25.520s > user 5m41.492s > sys 0m26.776s > > to: > > real 0m24.300s > user 14m35.084s > sys 0m50.108s > > I was surprised by the jump in CPU times. Doing "make -j1 coccicheck" > with your patch results in: > > real 5m34.887s > user 5m5.620s > sys 0m19.908s > > so it's really the parallelizing that seems to be to blame (possibly > because this CPU boosts from 2.3Ghz to 3.0Ghz, and we're only using 8 > threads in the first example). > >> - [RFC] >> With this patch 'make coccicheck's output will look like this >> ('make' apparently doesn't iterate over semantic patches in >> lexicographical order): >> >> SPATCH commit.cocci abspath.c >> SPATCH commit.cocci advice.c >> <... lots of lines ...> >> SPATCH array.cocci http-walker.c >> SPATCH array.cocci remote-curl.c >> >> which means that the number of lines in the output grows from >> "Nr-of-semantic-patches" to "Nr-of-semantic-patches * >> Nr-of-source-files". > > IMHO this is not really that big a deal. We are doing useful work for > each line, so to me it's just more eye candy (and it's really satisfying > to watch it zip by on the 40-core machine ;) ). FWIW on the 8-cpu box I usually test on this went from 2m30s to 1m50s, so not a huge improvement in time, but nice to have the per-file progress. > What if we inverted the current loop? That is, right now we iterate over > the cocci patches at the Makefile level, and then for each target we > iterate over the giant list of source files. Instead, we could teach the > Makefile to iterate over the source files, with a target for each, and > then hit each cocci patch inside there. > > That would give roughly the same output as a normal build. But moreover, > I wonder if we could make things faster by actually combining the cocci > files into a single set of rules to be applied to each source file. That > would mean invoking spatch 1/8 as much. It does give fewer opportunities > for "make" to reuse work, but that only matters when the cocci files > change (which is much less frequent than source files changing). > > Doing: > > cat contrib/coccinelle/*.cocci >mega.cocci > make -j40 coccicheck COCCI_SEM_PATCHES=mega.cocci > > gives me: > > real 0m17.573s > user 10m56.724s > sys 0m11.548s > > And that should show an improvement on more normal-sized systems, too, > because we really are eliminating some of the startup overhead. > > The other obvious downside is that you don't get individual patches for > each class of transformation. Do we care? Certainly for a routine "make > coccicheck" I primarily want to know: > > - is there something that needs fixing? > > - give me the patch for all fixes > > So I wonder if we'd want to have that be the default, and then perhaps > optionally give some targets to let people run single scripts (or not; > they could probably just run spatch themselves). > >> - [RFC] >> The overhead of applying a semantic patch to all source files >> became larger. 'make coccicheck' currently runs only one shell >> process and creates two output files for each semantic patch. >> With this patch it will run approx. "Nr-of-semantic-patches * >> Nr-of-source-files" shell processes and create twice as many >> output files. > > Doing the "one big .cocci" would help with this, too (and again puts us > in the same ballpark as a compile). > >> - [RFC] >> This approach uses $(eval), which we haven't used in any of our >> Makefiles yet. I wonder whether it's portable enough. And it's >> ugly and complicated. > > I looked into this a long time ago for another set of Makefile patches I > was considering. $(eval) was added to GNU make in 3.80, released in > 2002. Which is probably fine by now. > > If it isn't, one more exotic option would be to push the coccicheck > stuff into its own Makefile, and just run it via recursive make. Then > anybody doing a vanilla build can do so even with an antique make, but > you could only "make coccicheck" with something from the last 16 years > (but good luck getting ocaml running there). > > I suspect if we go with the one-spatch-per-source route, though, that we > could do this just with regular make rules. > > -Peff