Stefan Beller <sbeller@xxxxxxxxxx> writes:

>>  * On 2/2, doing it at xdiff.c level may be limiting this good idea
>>    to flourish to its full potential, as the interface is fed only
>>    one diff_filepair at a time.
>
> I realized that after I implemented it. I agree we would want to have
> it function cross file.
>
> So from my current understanding of the code,
> * diffcore_std would call a new function diffcore_detect_moved(void)
>   just before diffcore_apply_filter is called.
> * The new function diffcore_detect_moved would then check if the
>   diff is a valid textual diff (i.e. real files, not submodules, but
>   deletion/creation of one file is allowed)
>   If so we generate the diff internally and as in 2/2 would
>   hash all added/removed lines with context and store it.

I do not think you should step outside diff_flush().  Only when
producing textual diff, you would have to run the textual diff twice
by going over the q twice:

 * The first pass would run diff_flush_patch(), which would call
   into xdiff the usual way, but the callback from xdiff would
   capture the removed lines and the added lines without making any
   output.

 * The second pass would run diff_flush_patch(), but the callback
   from xdiff would be called with additional information, namely,
   the removed and the added lines captured in the first pass.

 * I suspect that the fn_out_consume() function that is used for a
   normal case (i.e. when we are not doing this more expensive
   "moved to/moved from" coloring) can be used for the second pass
   above (i.e. the "priv" aka "ecbdata" may need to be extended so
   that it can tell which mode of operation it is asked to perform),
   but if there is not enough similarity between the second pass of
   this "moved from/moved to" mode and the normal mode of output, it
   is also OK to have two different callback functions, i.e. the
   original one to be used in the normal mode, the second one that
   knows the "these are moved without modification" coloring.

The callback for the first pass is sufficiently different and I
think it is better to invent a new callback function to be used in
the first pass, instead of reusing fn_out_consume().

The fn_out_consume() function working in the "second pass of moved
from/moved to mode" would inspect line[] and see if it is an added
or a removed line, and then:

 - if it is an added line, and it appears as a removed line
   elsewhere in the patchset (you obtained the information in the
   first pass), you show it as "this was moved from elsewhere".

 - if it is a removed line, and it appears as an added line
   elsewhere in the patchset (you obtained the information in the
   first pass), you show it as "this was moved to elsewhere".

Or something like that.
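
To make the two passes concrete, here is a minimal sketch in C of the
bookkeeping only.  Every name in it (struct moved_lines,
collect_line(), classify_line(), enum line_class, MAX_LINES) is made
up for illustration and does not exist in diff.c; it is not wired
into xdiff or diff_flush_patch().  It assumes each callback line
starts with '+', '-' or ' ', compares whole lines with strcmp(), and
ignores context and hashing, which a real implementation would
probably want.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINES 1024	/* hypothetical fixed capacity, for the sketch only */

/* How the second pass would label one line of the textual diff. */
enum line_class {
	PLAIN_CONTEXT,
	PLAIN_ADDED,
	PLAIN_REMOVED,
	MOVED_FROM_ELSEWHERE,	/* added here, removed elsewhere in the patchset */
	MOVED_TO_ELSEWHERE	/* removed here, added elsewhere in the patchset */
};

/* State filled by the first pass and consulted by the second pass. */
struct moved_lines {
	const char *added[MAX_LINES];
	size_t nr_added;
	const char *removed[MAX_LINES];
	size_t nr_removed;
};

/* Linear search; a real implementation would likely use a hash table. */
int contains(const char *const *set, size_t nr, const char *text)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(set[i], text))
			return 1;
	return 0;
}

/* First-pass callback: remember added and removed lines, emit nothing. */
void collect_line(struct moved_lines *ml, const char *line)
{
	if (line[0] == '+') {
		assert(ml->nr_added < MAX_LINES);
		ml->added[ml->nr_added++] = line + 1;
	} else if (line[0] == '-') {
		assert(ml->nr_removed < MAX_LINES);
		ml->removed[ml->nr_removed++] = line + 1;
	}
}

/* Second-pass helper: decide which coloring a line should get. */
enum line_class classify_line(const struct moved_lines *ml, const char *line)
{
	if (line[0] == '+')
		return contains(ml->removed, ml->nr_removed, line + 1)
			? MOVED_FROM_ELSEWHERE : PLAIN_ADDED;
	if (line[0] == '-')
		return contains(ml->added, ml->nr_added, line + 1)
			? MOVED_TO_ELSEWHERE : PLAIN_REMOVED;
	return PLAIN_CONTEXT;
}
```

The first pass would feed every line of every filepair in the q to
collect_line(); the second pass would call classify_line() from the
output callback and pick the color from its return value.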