Re: coccinelle: improve array.cocci

On Thu, Nov 21, 2019 at 08:44:12PM +0100, Markus Elfring wrote:
> The program “spatch” supports parallelisation also directly by the parameter “--jobs”.
> Did you try it out occasionally?

I did try --jobs on a couple of occasions, and the results always
ranged from broken or not working at all to downright making things
even slower.  With 1.0.4, for example, it fails outright:


  $ spatch --version
  spatch version 1.0.4 with Python support and with PCRE support
  $ spatch --sp-file contrib/coccinelle/array.cocci --all-includes --patch . --jobs 2 alias.c alloc.c
  init_defs_builtins: /usr/lib/coccinelle/standard.h
  HANDLING: alias.c alloc.c
  Fatal error: exception Sys_error("array: No such file or directory")

This issue seems to be fixed in later versions, but 1.0.4 is the
version that many distros still ship and that is used in our CI
builds, so we do care about it.

For comparison, here is a baseline run with 1.0.8 and our default
settings:


  $ spatch --version
  spatch version 1.0.8 compiled with OCaml version 4.05.0
  Flags passed to the configure script: [none]
  OCaml scripting support: yes
  Python scripting support: yes
  Syntax of regular expressions: PCRE
  $ /usr/bin/time --format='%e | %M' make contrib/coccinelle/array.cocci.patch
      SPATCH contrib/coccinelle/array.cocci
  102.06 | 129084

Our Makefile recipes run Coccinelle in a sequential loop, one 'spatch'
invocation for each source file by default.  Therefore, merely passing
in '--jobs <N>' doesn't bring any runtime benefits:

  $ /usr/bin/time --format='%e | %M' make SPATCH_FLAGS='--all-includes --patch . --jobs 8' contrib/coccinelle/array.cocci.patch
      SPATCH contrib/coccinelle/array.cocci
  105.31 | 118512
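
To illustrate why, the default recipe behaves roughly like the loop
below.  This is only a sketch, not the actual Makefile rule: the
function name 'run_spatch_sequentially' is made up, and SPATCH and
SPATCH_FLAGS merely mirror the Makefile variables.  It also assumes
that '--jobs' works by distributing the input files among workers.

```shell
# Hypothetical stand-in for the default recipe: one spatch process
# per source file, run strictly one after the other.
SPATCH=${SPATCH:-spatch}

run_spatch_sequentially () {
	cocci=$1
	shift
	for f in "$@"
	do
		# Each invocation is handed a single file, so (under the
		# assumption above) '--jobs <N>' has nothing to spread
		# across workers.  SPATCH_FLAGS is intentionally unquoted
		# so that it splits into separate options.
		$SPATCH --sp-file "$cocci" ${SPATCH_FLAGS-} "$f" || return 1
	done
}
```

It would be called as 'run_spatch_sequentially contrib/coccinelle/array.cocci alias.c alloc.c'.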

Some time ago we found that invoking 'spatch' with multiple files at
once does bring a notable speedup (with 1.0.4), although at the cost of
a drastically increased memory footprint, see commit 960154b9c1
(coccicheck: optionally batch spatch invocations, 2019-05-06).  Alas,
trying to use that in the hope that 'spatch' can do more in parallel
if it has more files to process at once doesn't bring any runtime
benefits, either:

  $ /usr/bin/time --format='%e | %M' make SPATCH_FLAGS='--all-includes --patch . --jobs 8' SPATCH_BATCH_SIZE=8 contrib/coccinelle/array.cocci.patch
      SPATCH contrib/coccinelle/array.cocci
  116.27 | 349964

Increasing the batch size further just makes it notably slower; also
note that the max memory usage is an order of magnitude higher:

  $ /usr/bin/time --format='%e | %M' make SPATCH_FLAGS='--all-includes --patch . --jobs 8' SPATCH_BATCH_SIZE=32 contrib/coccinelle/array.cocci.patch
      SPATCH contrib/coccinelle/array.cocci
  197.70 | 1205784

It appears that batching 'spatch' invocations with 1.0.8 does not
bring the same benefits as with 1.0.4, but brings slowdowns instead...
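
For readers unfamiliar with SPATCH_BATCH_SIZE, batching boils down to
something like the following.  This is a minimal sketch that assumes
the file list is chopped up with 'xargs -n'; the function name
'run_spatch_batched' is hypothetical, and the real recipe in the
Makefile may differ in its details.

```shell
# Hypothetical sketch of batching: hand spatch up to <batch_size>
# files per invocation instead of one.
SPATCH=${SPATCH:-spatch}

run_spatch_batched () {
	cocci=$1 batch_size=$2
	shift 2
	# One file name per line; xargs appends at most <batch_size> of
	# them to each spatch command line.  xargs splits on whitespace,
	# so this only works for file names without spaces.
	printf '%s\n' "$@" |
	xargs -n "$batch_size" $SPATCH --sp-file "$cocci" ${SPATCH_FLAGS-}
}
```

With a batch size of 8 or 32 each 'spatch' process gets several files
to work on, which is the situation measured above.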

Anyway, looking at 'ps u -L' output it appears that 'spatch' doesn't
really do any parallel work, and there are only two 'spatch' processes
and no threads despite '--jobs 8':

  szeder    2561  0.4  0.5  36944 21520 pts/0    S+   15:31   0:00 spatch
  szeder    2567 97.1 30.5 1228372 1205332 pts/0 R+   15:31   0:29 spatch


Note that 1.0.8 above was run in a Docker container, while 1.0.4 was
run on the host.  This may or may not have influenced the runtimes
reported above.  FWIW, 'make -j4 coccicheck' parallelizes just fine
even in the container and with 1.0.8.


A different approach relying on 'make -j' to parallelize 'spatch'
invocations was discussed here:

  https://public-inbox.org/git/20180802115522.16107-1-szeder.dev@xxxxxxxxx/T/#u
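
The general idea of leaving the parallelism to something outside
'spatch' can be sketched as below.  Note that this uses 'xargs -P' as
a stand-in for 'make -j' and is not the approach from the linked
thread; the function name 'run_spatch_parallel' is made up, and
'-P' is a common xargs extension rather than POSIX.

```shell
# Hypothetical sketch: run independent single-file spatch processes
# concurrently, letting xargs (instead of 'spatch --jobs') schedule them.
SPATCH=${SPATCH:-spatch}

run_spatch_parallel () {
	cocci=$1 jobs=$2
	shift 2
	# '-n 1' gives each spatch process exactly one file, '-P' caps
	# the number of processes running at the same time.  The order
	# of the output is not deterministic; a real recipe would
	# rather write one result file per source file.
	printf '%s\n' "$@" |
	xargs -n 1 -P "$jobs" $SPATCH --sp-file "$cocci" ${SPATCH_FLAGS-}
}
```

For example, 'run_spatch_parallel contrib/coccinelle/array.cocci 4 alias.c alloc.c'
would keep up to four 'spatch' processes busy.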



