On Tue, Oct 02, 2018 at 03:55:19PM -0400, Jeff King wrote: > On Tue, Oct 02, 2018 at 12:16:42PM -0700, Jacob Keller wrote: > > > make coccicheck is used in order to apply coccinelle semantic patches, > > and see if any of the transformations found within contrib/coccinelle/ > > can be applied to the current code base. > > > > Pass every file to a single invocation of spatch, instead of running > > spatch once per source file. > > > > This reduces the time required to run make coccicheck by a significant > > amount of time: > > > > Prior timing of make coccicheck > > real 6m14.090s > > user 25m2.606s > > sys 1m22.919s > > > > New timing of make coccicheck > > real 1m36.580s > > user 7m55.933s > > sys 0m18.219s > > Yay! This is a nice result. > > It's also one of the things that Julia suggested in an earlier thread. > One thing I wasn't quite sure about after digging into the various > versions (1.0.4 on Debian stable/unstable, 1.0.6 in experimental, and > pre-1.0.7 at the bleeding edge) was whether the old versions would be > similarly helped (or work at all). > > I just replicated your results with 1.0.4.deb-3+b2 from Debian stable. > It's possible there are older versions floating around, but for > something developer-only like this, I think "in Debian stable" is a > reasonable enough cutoff. Linux build jobs on Travis CI run Ubuntu Trusty 14.04 LTS, and therefore our static analysis build job still runs 1.0.0~rc19.deb-3. This patch appears to work fine with that version, too, though note that I also changed the build job to don't run two parallel jobs; for the reason see below. > > This is nearly a 4x decrease in the time required to run make > > coccicheck. This is due to the overhead of restarting spatch for every > > file. By processing all files at once, we can amortize this startup cost > > across the total number of files, rather than paying it once per file. > > One minor side effect is that we lose the opportunity to parallelize > quite as much. However, I think the reduction in total CPU makes it well > worth that (and we still have 8 cocci files and growing, which should > keep at least 8 cores busy). One major side effect, however, is the drastically increased memory usage resulting from processing all files at once. With your patch on top of current master: $ for cocci in contrib/coccinelle/*.cocci ; do command time -f 'max RSS: %MkB' make ${cocci}.patch ; done SPATCH contrib/coccinelle/array.cocci max RSS: 1537068kB SPATCH contrib/coccinelle/commit.cocci max RSS: 1510920kB SPATCH contrib/coccinelle/free.cocci max RSS: 1393712kB SPATCH contrib/coccinelle/object_id.cocci SPATCH result: contrib/coccinelle/object_id.cocci.patch max RSS: 1831700kB SPATCH contrib/coccinelle/qsort.cocci max RSS: 1384960kB SPATCH contrib/coccinelle/strbuf.cocci max RSS: 1395800kB SPATCH contrib/coccinelle/swap.cocci max RSS: 1393620kB SPATCH contrib/coccinelle/xstrdup_or_null.cocci max RSS: 1371716kB Without your patch the max RSS lingers around 87048kB - 101980kB, meaning ~15x - 18x increase This might cause quite the surprise to developers who are used to run 'make -jN coccicheck'; my tiny laptop came to a grinding halt with -j4. I think this side effect should be mentioned in the commit message. Furthermore, the above mentioned static analysis build job on Travis CI runs two parallel jobs. Perhaps we should be considerate and reduce that to a single job, even if that means that he build job will run longer. > I think recent versions of Coccinelle will actually parallelize > internally, too, but my 1.0.4 doesn't seem to. That's probably what I > was thinking of earlier (but this is still a win even without that). I think Coccinelle parallelizes only when invoked with -j <N>, but that option is not documented in 1.0.4. I wrote "documented" instead of "present", because e.g.: $ spatch --sp-file contrib/coccinelle/swap.cocci -j 2 a*.c doesn't abort with "unknown option" and usage, but runs for a bit and then outputs: init_defs_builtins: /usr/lib/coccinelle/standard.h HANDLING: abspath.c advice.c alias.c alloc.c apply.c archive.c archive-tar.c archive-zip.c argv-array.c attr.c Fatal error: exception Sys_error("swap: No such file or directory") Without '-j 2' the command runs just fine. I don't know what to make of it. > It looks like this also fixes a problem I ran into when doing the oideq > work, which is that the patch for a header file would get shown multiple > times (once for each file that includes it). So this is faster _and_ > more correct. Double-yay. In that case I used to apply the change to the header first, and then apply the semantic patch one more time.