Re: [PATCH] coccicheck: optionally batch spatch invocations

On Mon, May 6, 2019 at 4:43 PM Jeff King <peff@xxxxxxxx> wrote:
>
> On Mon, May 06, 2019 at 04:34:09PM +0700, Duy Nguyen wrote:
>
> > > However, it comes at a cost. The RSS of each spatch process goes from
> > > ~50MB to ~1500MB (and peak memory usage may be even higher if make runs
> >
> > 1.5G should be fine. Trying...
> >
> > Even with no -j, my htop's RES column goes up to 6GB and puts my laptop in
> > "swap every bit of memory out, including the bits handling the screen"
> > mode :( I don't think it was even the peak.
>
> Interesting if you have a different version of spatch. I'm using 1.0.4
> from Debian unstable.
>
> I had just been eyeballing the values in "top" before, but I actually
> measured more carefully. My peak was actually ~1900MB.
>
> > It's probably a bit too much to ask, but is it possible to handle N
> > files at a time (instead of all files), which consumes less memory and
> > runs a bit slower, but still better than the default mode? I can see
> > it already gets tricky doing complicated stuff in Makefile so "no" is
> > perfectly ok.
>
> I almost did this initially but I feared that nobody would actually use
> it. :) So given at least one person who wants it, I took a look. If we
> rely on xargs, then it is really not too bad (and is in fact shorter
> than the current code). I also wrote up a pure-shell version, but it's
> rather verbose even after taking some shortcuts with whitespace
> splitting.
>
> So here's what I think we should apply:
>
> -- >8 --
> Subject: [PATCH] coccicheck: optionally batch spatch invocations
>
> In our "make coccicheck" rule, we currently feed each source file to its
> own individual invocation of spatch. This has a few downsides:
>
>   - it repeats any overhead spatch has for starting up and reading the
>     patch file
>
>   - any included header files may get processed from multiple
>     invocations. This is slow (we see the same header files multiple
>     times) and may produce a resulting patch with repeated hunks (which
>     cannot be applied without further cleanup)
>
> Ideally we'd just invoke a single instance of spatch per rule-file and
> feed it all source files. But spatch can be rather memory hungry when
> run in this way. I measured the peak RSS going from ~90MB for a single
> file to ~1900MB for all files. Multiplied by multiple rule files being
> processed at the same time (for "make -j"), this can make things slower
> or even cause them to fail (e.g., this is reported to happen on our
> Travis builds).
>
> Instead, let's provide a tunable knob. We'll leave the default at "1",
> but it can be cranked up to "999" for maximum CPU/memory tradeoff, or
> people can find points in between that serve their particular machines.
>
> Here are a few numbers running a single rule via:
>
>   SIZES='1 4 16 999'
>   RULE=contrib/coccinelle/object_id.cocci
>   for i in $SIZES; do
>     make clean
>     /usr/bin/time -o $i.out --format='%e | %U | %S | %M' \
>       make $RULE.patch SPATCH_BATCH_SIZE=$i
>   done
>   for i in $SIZES; do
>     printf '%4d | %s\n' $i "$(cat $i.out)"
>   done
>
> which yields:
>
>      1 | 97.73 | 93.38 | 4.33 | 100128
>      4 | 52.80 | 51.14 | 1.69 | 135204
>     16 | 35.82 | 35.09 | 0.76 | 284124
>    999 | 23.30 | 23.13 | 0.20 | 1903852
>
> The implementation is done with xargs, which should be widely available;
> it's in POSIX, and we already rely on it in the test suite. And "coccicheck"
> is really a developer-only tool anyway, so it's not a big deal if
> obscure systems can't run it.
>
> Signed-off-by: Jeff King <peff@xxxxxxxx>
> ---
> I left the default at 1 for safety. Probably 4 or 16 would be an OK
> default, but I don't have any interest in figuring out exactly what
> Travis or some hypothetical average machine can handle. I'll be setting
> mine to 999. ;)
>
> Making "0" work as "unlimited" might be nice, but xargs doesn't support
> that and I didn't want to make the recipe any more unreadable than it
> already is.
>
>  Makefile | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 9f1b6e8926..daba958b8f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1174,8 +1174,10 @@ PTHREAD_CFLAGS =
>  SPARSE_FLAGS ?=
>  SP_EXTRA_FLAGS =
>
> -# For the 'coccicheck' target
> +# For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
> +# usually result in less CPU usage at the cost of higher peak memory.
>  SPATCH_FLAGS = --all-includes --patch .
> +SPATCH_BATCH_SIZE = 1
>
>  include config.mak.uname
>  -include config.mak.autogen
> @@ -2790,12 +2792,9 @@ endif
>
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>         @echo '    ' SPATCH $<; \
> -       ret=0; \
> -       for f in $(COCCI_SOURCES); do \
> -               $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> -                       { ret=$$?; break; }; \
> -       done >$@+ 2>$@.log; \
> -       if test $$ret != 0; \
> +       if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> +               $(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
> +               >$@+ 2>$@.log; \
>         then \
>                 cat $@.log; \
>                 exit 1; \
> --
> 2.21.0.1314.g224b191707
>

This looks reasonable to me :)
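
For anyone skimming the thread, here is a minimal sketch of the batching
behavior the new recipe relies on. It is not the Makefile rule itself:
FILES and BATCH_SIZE are made-up stand-ins for $(COCCI_SOURCES) and
SPATCH_BATCH_SIZE, rule.cocci is a placeholder name, and "echo spatch"
only prints the command lines instead of running spatch.

```shell
# Hypothetical stand-ins for $(COCCI_SOURCES) and SPATCH_BATCH_SIZE.
FILES='a.c b.c c.c d.c e.c'
BATCH_SIZE=2

# xargs -n hands at most BATCH_SIZE of the stdin arguments to each
# invocation; the fixed arguments are repeated on every command line.
# "echo spatch" just shows what would be run.
batches=$(echo $FILES | xargs -n "$BATCH_SIZE" echo spatch --sp-file rule.cocci)
printf '%s\n' "$batches"
# prints:
#   spatch --sp-file rule.cocci a.c b.c
#   spatch --sp-file rule.cocci c.c d.c
#   spatch --sp-file rule.cocci e.c
```

With BATCH_SIZE=1 this degenerates to one invocation per file, which
matches the old loop, and a value larger than the file count gives a
single invocation.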

Thanks,
Jake


