On Wed, 11 Dec 2007, Junio C Hamano wrote: > Jakub Narebski <jnareb@xxxxxxxxx> writes: >> Junio C Hamano wrote: >>> Jakub Narebski <jnareb@xxxxxxxxx> writes: >>> >>>> Previous version of this patch (from 7 May 2007) used instead of current >>>> only "to_name" format similar to git-diff-tree raw format for renames: >>>> >>>> added deleted TAB path for "src" TAB path for "dst" LF > > That's perfectly fine (that is without -z). > >>>> ... Previous version >>>> of patch used (or was to use actually, because of error in the code) >>>> >>>> added deleted TAB path for "src" NUL NUL path for "dst" NUL >>>> >>>> when -z option was used. > > I think your "previous" one: > > <added> <deleted> <src> NUL (no rename) > <added> <deleted> <src> NUL NUL <dst> NUL (with rename) > > would be unambiguously parsable, but it would be cleaner and easier for > the parser if the latter were like this instead: > > <added> <deleted> NUL <src> NUL <dst> NUL (with rename) > > The reader can expect to find a NUL terminated src, finds the length is > zero and notices it cannot be src and then reads two paths from that > point. Very good idea. > Without having the status letter, we cannot do "optional two paths" > without breaking existing --numstat -z readers that do not care about > renames and copies, and I agree that existing --numstat -z readers that > pass -M or -C are already broken, so I think a format extension along > the above line (your "previous" and the above clean-up proposal have the > same power of expressiveness, I just think the latter is syntactically > cleaner) would be reasonable. And at that point we probably would not > need --numstat-enhanced at all ;-) The previous patch is a fix (usability fix) for numstat output in the presence of rename detection, making it truly machine friendly. Moreover it should not break any scripts which parsed numstat output not expecting to deal with renames (for example if repository has diff.renames config option set), and might even fix them. The proposed - and implemented in patch below - change could break some scripts not expecting to deal with new numstat output. Adding status letter (with similarity/dissimilarity percentage info) would unfortunately require greater surgery... We can either allow scripts (do there exist any?) to break, or make the full rename info shown only for --numstat-extended; but I'd like to have this status letter for --numstat-extended / --numstat-enhanced. So possible solution are: (a) Accept both patches, and accept remote possibility that some scripts might break (b) Add --numstat-extended option whoich would show rename info as implemented in this patch (and of course accept previous) (c) Implement --numstat-extended with status letter, as suggested, accept first patch but not this one. Note that gitweb would require --numstat with proper rename detection support if we want to e.g. provide graphical diffstat in 'commit' view for merge commits. P.S. The numstat output format for renames should be probably described in documentation, and not only in commit message, but I was not sure where to put it (and even how it should be written). -- >8 -- From: Jakub Narebski <jnareb@xxxxxxxxx> Date: Tue, 11 Dec 2007 23:52:18 +0100 Subject: [RFC/PATCH] diff: Show rename info in numstat output, correctly Make numstat output of git-diff show both source and destination filename for renames (and copies) in easily parseable format, similar to raw diff output format. The numstat format for rename (or copy) is now <added> <deleted> TAB <path for "src"> TAB <path for "dst"> LF or if -z option is used <added> <deleted> TAB NULL <path for "src"> NULL <path for "dst"> NULL where <added> and <deleted> is number of added/deleted lines (or '-' for binary file). When -z option is not used, TAB, LF, and backslash characters in pathnames are represented as \t, \n, and \\, respectively. Idea for -z numstat rename format by Junio C Hamano. Signed-off-by: Jakub Narebski <jnareb@xxxxxxxxx> --- diff.c | 9 ++++++++- 1 files changed, 8 insertions(+), 1 deletions(-) diff --git a/diff.c b/diff.c index 8039ac7..56cbf28 100644 --- a/diff.c +++ b/diff.c @@ -991,7 +991,14 @@ static void show_numstat(struct diffstat_t* data, struct diff_options *options) printf("-\t-\t"); else printf("%d\t%d\t", file->added, file->deleted); - write_name_quoted(file->name, stdout, options->line_termination); + if (file->is_renamed) { + char sep = options->line_termination ? '\t' : '\0'; + if (!options->line_termination) + putchar(options->line_termination); + write_name_quoted(file->from_name, stdout, sep); + write_name_quoted(file->name, stdout, options->line_termination); + } else + write_name_quoted(file->name, stdout, options->line_termination); } } -- 1.5.3.7 - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html