On Thu, Jan 24, 2019 at 4:32 AM Jeff King <peff@xxxxxxxx> wrote: > > There's an odd bug when "log --color-moved" is used with the combination > of "--cc --stat -p": the stat for merge commits is erroneously shown > with the diff of the _next_ commit. > > The included test demonstrates the issue. Our history looks something > like this: > > A-B-M--D > \ / > C > > When we run "git log --cc --stat -p --color-moved" starting at D, we get > this sequence of events: > > 1. The diff for D is using -p, so diff_flush() calls into > diff_flush_patch_all_file_pairs(). There we see that o->color_moved > is in effect, so we point o->emitted_symbols to a static local > struct, causing diff_flush_patch() to queue the symbols instead of > actually writing them out. > > We then do our move detection, emit the symbols, and clear the > struct. But we leave o->emitted_symbols pointing to our struct. > > 2. Next we compute the diff for M. This is a merge, so we use the > combined diff code. In find_paths_generic(), we compute the > pairwise diff between each commit and its parent. Normally this is > done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for > intersecting paths. But since "--stat --cc" shows the first-parent > stat, and since we're computing that diff anyway, we enable > DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat > information immediately, saving us from running a separate > first-parent diff later. > > But where does that output go? Normally it goes directly to stdout, > but because o->emitted_symbols is set, we queue it. As a result, we > don't actually print the diffstat for the merge commit (yet), Thanks for your analysis. As always a pleasant read. I understand and agree with what is written up to here remembering the code vaguely. > which > is wrong. I disagree with this sentiment. If we remember to flush the queued output this is merely an inefficiency due to implementation details, but not wrong. We could argue that it is wrong to have o->emitted_symbols set, as we know we don't need it for producing a diffstat only. > > 3. Next we compute the diff for C. We're actually showing a patch > again, so we end up in diff_flush_patch_all_file_pairs(), but this > time we have the queued stat from step 2 waiting in our struct. Right, that is how the queueing can produce errors. I wonder if the test that is included in this patch would work on top of e6e045f803 ("diff.c: buffer all output if asked to", 2017-06-29) as that commit specifically wanted to make sure these errors would be caught. > > We add new elements to it for C's diff, and then flush the whole > thing. And we see the diffstat from M as part of C's diff, which is > wrong. > > So triggering the bug really does require the combination of all of > those options. Similarly we can trigger bugs by using options that enable buffering (so far only --color-moved) and options that do not fully buffer and flush. > > To fix it, we can simply restore o->emitted_symbols to NULL after > flushing it, so that it does not affect anything outside of > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > nobody outside of that function is going to bother flushing it, so we > would not want them to write to it either. This would also cause the inefficiency I mentioned after (2) to disappear, as the merge commits diffstat would be just printed to stdout? > > In fact, we could take this a step further and turn the local "esm" > struct into a non-static variable that goes away after the function > ends. However, since it contains a dynamically sized array, we benefit > from amortizing the cost of allocations over many calls. So we'll leave > it as static to retain that benefit. okay. > > But let's push the zero-ing of esm.nr into the conditional for "if > (o->emitted_symbols)" to make it clear that we do not expect esm to hold > any values if we did not just try to use it. With the code as it is > written now, if we did encounter such a case (which I think would be a > bug), we'd silently leak those values without even bothering to display > them. With this change, we'd at least eventually show them, and somebody > would notice. Wow. Good call. > > Signed-off-by: Jeff King <peff@xxxxxxxx> Reviewed-by: Stefan Beller <sbeller@xxxxxxxxxx>