[PATCH v4 0/4] Makefile/coccicheck: fix bugs and speed it up

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



A v4 (I'm counting the 5/4 patch I sent to v2 as a v3) which produces
the exact same end result as that v2 + 5/4 patch, but with a rewritten
commit message/squash as requested by Junio.

Ævar Arnfjörð Bjarmason (4):
  Makefile/coccicheck: add comment heading for all SPATCH flags
  Makefile/coccicheck: speed up and fix bug with duplicate hunks
  Makefile/coccicheck: allow for setting xargs concurrency
  Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8

 Makefile | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

Range-diff:
1:  d3a2df76b04 = 1:  d8c6efd2464 Makefile/coccicheck: add comment heading for all SPATCH flags
2:  6b0901e0fc5 ! 2:  3bca3239cb8 Makefile/coccicheck: speed up and fix bug with duplicate hunks
    @@ Commit message
         Makefile/coccicheck: speed up and fix bug with duplicate hunks
     
         Change the coccicheck target to run on all of our *.c and *.h files
    -    with --no-includes, instead of only on the *.c files with
    -    --all-includes.
    -
    -    This speeds it up significantly and reduces its memory usage, since it
    -    doesn't need to parse N includes for every file it visits.
    -
    -    See [1] for a discussion thread about this commit with some timings
    -    for details, but briefly: This change speeds it up by ~2x and makes it
    -    use much less memory. Or a reduction of a max of around ~2GB
    -    per-process (under the old SPATCH_BATCH_SIZE=0) to around ~200MB.
    -
    -    Looking at the history of the coccicheck target I think we've always
    -    been running it in the wrong mode for what we wanted to achieve:
    -
    -     - When it was added in 63f0a758a06 (add coccicheck make target,
    -       2016-09-15) it explicitly processed only the %.c files.
    -
    -     - Then in a9a884aea57 (coccicheck: use --all-includes by default,
    -       2016-09-30) it started processing the %.h files by looking around for
    -       its own includes.
    -
    -    Let's instead just point it to both our *.c and *.h files, then
    -    there's no need to have it recursively look around for included files
    -    to change.
    -
    -    Setting --no-includes would not work if we expected to set
    -    COCCI_SOURCES to a subset of our source files, but that's not what
    -    we're doing here. If anyone manually tweaks COCCI_SOURCES they'll now
    -    need to tweak SPATCH_FLAGS too. The speed and correctness we gain is
    -    worth that small trade-off.
    -
    -    Using --no-includes also fixes a subtle bug introduced in
    -    960154b9c17 (coccicheck: optionally batch spatch invocations,
    -    2019-05-06) with duplicate hunks being written to the
    -    generated *.patch files.
    -
    -    This is because that change altered a file-at-a-time for-loop to an
    -    invocation of "xargs -n X". This would not matter for most other
    -    programs, but it matters for spatch.
    -
    -    This is because each spatch invocation will maintain shared lock files
    -    in /tmp, check if files being parsed were changed etc. I haven't dug
    -    into why exactly, but it's easy to reproduce the issue[2]. The issue
    -    goes away entirely if we just use --no-includes, which as noted above
    -    would have made sense even without that issue.
    -
    -    1. https://lore.kernel.org/git/20210302205103.12230-1-avarab@xxxxxxxxx/
    -    2. A session showing racy spatch under xargs -n X:
    -
    -        $ cat test.cocci
    -        @@
    -        expression E1;
    -        @@
    -        - strbuf_avail(E1)
    -        + strbuf_has(E1)
    -        $ for i in 1 2 4 16 64 128 512
    -        do
    -            echo with xargs -n $i: &&
    -            echo *.c | xargs -n $i spatch --sp-file \
    -                test.cocci --all-includes --patch . 2>/dev/null | \
    -            grep -F +++ | sort | uniq -c
    -        done
    -        with xargs -n 1:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -        with xargs -n 2:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -        with xargs -n 4:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -        with xargs -n 16:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -              2 +++ b/strbuf.h
    -        with xargs -n 64:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -              2 +++ b/strbuf.h
    -        with xargs -n 128:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -              2 +++ b/strbuf.h
    -        with xargs -n 512:
    -              1 +++ b/convert.c
    -              1 +++ b/strbuf.c
    -              1 +++ b/strbuf.h
    +    with --include-headers-for-types, instead of trusting it to find *.h
    +    files and other includes to modify from its recursive walking of
    +    includes as it has been doing with only --all-includes.
    +
    +    The --all-includes option introduced in a9a884aea57 (coccicheck: use
    +    --all-includes by default, 2016-09-30) is needed because we have rules
    +    that e.g. use the "type T" construct that wouldn't match unless we
    +    scoured our headers for the definition of the relevant type.
    +
    +    But combining --all-includes it with processing N files at a time with
    +    xargs as we've done since 960154b9c17 (coccicheck: optionally batch
    +    spatch invocations, 2019-05-06) introduced a subtle bug with duplicate
    +    hunks being written to the generated *.patch files. I did not dig down
    +    to the root cause of that, but I think it's due to spatch doing (and
    +    failing to do) some magical locking/racy mtime checking to decide if
    +    it emits a hunk. See [1] for a way to reproduce the issue, and a
    +    discussion of it.
    +
    +    This change speeds up the runtime of "make -j8 coccicheck" from ~1m50s
    +    to ~1m20s for me. See [2] for more timings.
    +
    +    We could also use --no-includes for a runtime of ~55s, but that would
    +    produce broken patches (we miss some hunks) in cases where we need the
    +    type or other definitions from included files.
    +
    +    If anyone cares there's probably an optimization opportunity here to
    +    e.g. detect that the whole file doesn't need to consider includes,
    +    since the rules would be unambiguous without considering them.
    +
    +    Note on portability: The --include-headers-for-types option is not in
    +    my "man spatch", but it's been part of spatch since 2014. See its
    +    fe3a327a (include headers for types option, 2014-07-27), it was first
    +    released with version 1.0.0 of spatch, released in April 2015. The
    +    spatch developers are just inconsistent about updating their TeX docs
    +    and manpages at the same time.
    +
    +    1. https://lore.kernel.org/git/20210305170724.23859-3-avarab@xxxxxxxxx/
    +    2. https://lore.kernel.org/git/20210306192525.15197-1-avarab@xxxxxxxxx/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
     
    @@ Makefile: SPARSE_FLAGS ?=
      
      # For the 'coccicheck' target
     -SPATCH_FLAGS = --all-includes --patch .
    -+SPATCH_FLAGS = --no-includes --patch .
    ++SPATCH_FLAGS = --all-includes --include-headers-for-types --patch .
     +
      # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
      # usually result in less CPU usage at the cost of higher peak memory.
3:  c913d5dbe9b ! 3:  9d5814dacdc Makefile/coccicheck: allow for setting xargs concurrency
    @@ Commit message
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
     
      ## Makefile ##
    -@@ Makefile: SPATCH_FLAGS = --no-includes --patch .
    +@@ Makefile: SPATCH_FLAGS = --all-includes --include-headers-for-types --patch .
      # Setting it to 0 will feed all files in a single spatch invocation.
      SPATCH_BATCH_SIZE = 1
      
4:  98225e65d30 ! 4:  51b782bb9b6 Makefile/coccicheck: set SPATCH_BATCH_SIZE to 8
    @@ Commit message
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
     
      ## Makefile ##
    -@@ Makefile: SPATCH_FLAGS = --no-includes --patch .
    +@@ Makefile: SPATCH_FLAGS = --all-includes --include-headers-for-types --patch .
      # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
      # usually result in less CPU usage at the cost of higher peak memory.
      # Setting it to 0 will feed all files in a single spatch invocation.
5:  60649abddbb < -:  ----------- Makefile/coccicheck: use --include-headers-for-types
-- 
2.31.0.285.gb40d23e604f




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux