Stefan Beller <sbeller@xxxxxxxxxx> writes: >>> +static void show_submodule_header(struct diff_options *o, const char *path, >>> struct object_id *one, struct object_id *two, >>> unsigned dirty_submodule, const char *meta, >>> const char *reset, >> ... >> How does capturing these lines help moved line detection, by the >> way? These must never be matched with any other added or removed >> line in the real patch output. > > Why? What are buffered are not patch text, but informational text like "Submodule X contains untracked content", etc. When a text file is modified elsewhere and lost a line that happened to say the same contents, we do not want to consider that such a line was moved to where a submodule had an untracked file. I have a suspicion that the two-pass buffering is done at too high a level in this series. Doesn't the code (I haven't reached the end of the series) update emit_line() to buffer the patch text and these non-patch text with all the coloring and resetting sequences? Because the "ah, this old line removed corresponds to that new line that appears elsewhere?" logic do not want to see these color/reset sequence, the buffering code needs to become quite specific to how the current diff code is colored (e.g. a line must be painted in a single color and have reset at the end) and makes future change to color things differently almost impossible (e.g. imagine how you would add a "feature" that paints certain words on added lines differently?). Ahh, yes, I see NEEDSWORK comment in patch 19/20. Yes, I agree that this code really should be working in terms of offsets into pre/post images when finding matching changes, which probably should happen without letting fn_out_consume produce fully colored textual diff output in the first pass. For the purpose of "moved lines detection", the logic to match a stretch of preimage lines with postimage lines do not want to bother with "--- a/$path" headers, and it does not want to care if a line that begins with '+' needs to be added by calling emit_add_line() that knows how to check ws errors or the payload needs to be painted in green. After the first pass determines which added lines are not true addition but merely moved, the second pass would know how that '+' line needs to be painted a lot better (e.g. it may not be painted in green). Letting fn_out_consume() call emit_add_line() only to compute information (e.g. "'+'? ok, green") that the first pass does not want to see and the second pass will compute better is probably not a good longer term direction to go in. Having said that, we need to start somewhere, and I think it is a reasonable first-cut attempt to work on top of the textual output like this series does (IOW, while I do agree with the NEEDSWORK and the way this series currently does things must be revamped in the longer term, I do not think we should wait until that happens to start playing with this topic). Thanks.