On Fri, Mar 05, 2021 at 06:07:22PM +0100, Ævar Arnfjörð Bjarmason wrote: > Using --no-includes also fixes a subtle bug introduced in > 960154b9c17 (coccicheck: optionally batch spatch invocations, > 2019-05-06) with duplicate hunks being written to the > generated *.patch files. > > This is because that change altered a file-at-a-time for-loop to an > invocation of "xargs -n X". This would not matter for most other > programs, but it matters for spatch. > > This is because each spatch invocation will maintain shared lock files > in /tmp, check if files being parsed were changed etc. I haven't dug > into why exactly, but it's easy to reproduce the issue[2]. The issue > goes away entirely if we just use --no-includes, which as noted above > would have made sense even without that issue. This part still doesn't make any sense to me. If we are running with SPATCH_BATCH_SIZE=1, which is the default, then "xargs -n" is still going to run it in file-at-a-time mode. From spatch's perspective, there's no difference between a for-loop and "xargs -n 1" (unless it somehow cares about stdin, but it shouldn't). Using strace, I do see it creating files in /tmp, but they are all named after the process id, and cleaned up before exit. So I don't see how they could interfere with each other (certainly not in a sequential run, but even if you were to use "xargs -P" to get parallel runs, they seem distinct). If we increase the batch size, I'd expect _fewer_ duplicates. Because in file-at-a-time mode with --all-includes, wouldn't every file that mentions an include possibly end up emitting a patch for it? The results you show here (which do replicate for me) imply something much weirder is going on: > with xargs -n 1: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > with xargs -n 2: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > with xargs -n 4: > 1 +++ b/convert.c > 1 +++ b/strbuf.c These results are wrong! They are not finding the entry in strbuf.h that should be changed. > with xargs -n 16: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > 2 +++ b/strbuf.h > with xargs -n 64: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > 2 +++ b/strbuf.h > with xargs -n 128: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > 2 +++ b/strbuf.h These ones are also wrong. Now we find the strbuf.h mention, but we are finding it twice. > with xargs -n 512: > 1 +++ b/convert.c > 1 +++ b/strbuf.c > 1 +++ b/strbuf.h And this one, which is given all of the paths in one invocation gets it right. I'd expect that over the "128" version, which is running two independent spatch invocations. But the fact that "64" and "16" produce exactly two duplicates makes little sense. And even less that 1, 2, and 4 don't find the header change at all. Running the same test with a for loop produces the same (wrong) results as "-n 1", as expected: $ for i in *.c; do spatch --sp-file test.cocci --all-includes --patch . $i 2>/dev/null done | grep -F +++ | sort | uniq -c 1 +++ b/convert.c 1 +++ b/strbuf.c So in short I have no idea what spatch is doing. But I remain unconvinced that there is anything wrong with the batch-size patch. Running it like your patch will, feeding the header directly along with --no-includes, does find the correct result: $ for i in *.c *.h; do spatch --sp-file test.cocci --no-includes --patch . $i 2>/dev/null done | grep -F +++ | sort | uniq -c 1 +++ b/convert.c 1 +++ b/strbuf.c 1 +++ b/strbuf.h though I am still concerned by René's example which is missing more results. -Peff