On Tue, Jul 3, 2018 at 7:27 AM Johannes Schindelin via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > At this stage, `git range-diff` can determine corresponding commits > of two related commit ranges. This makes use of the recently introduced > implementation of the Hungarian algorithm. Did you want s/Hungarian/Jonker-Volgenant/ here? (Not worth a re-roll.) > The core of this patch is a straight port of the ideas of tbdiff, the > apparently dormant project at https://github.com/trast/tbdiff. > [...] > Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx> > --- > diff --git a/builtin/range-diff.c b/builtin/range-diff.c > @@ -17,9 +18,49 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix) > + int res = 0; > + struct strbuf range1 = STRBUF_INIT, range2 = STRBUF_INIT; > > - argc = parse_options(argc, argv, NULL, options, > - builtin_range_diff_usage, 0); > + argc = parse_options(argc, argv, NULL, options, builtin_range_diff_usage, > + 0); This parse_options() change appears to be merely a re-wrapping of the line between patches 2 and 3. > - return 0; > + if (argc == 2) { > + if (!strstr(argv[0], "..")) > + warning(_("no .. in range: '%s'"), argv[0]); > + strbuf_addstr(&range1, argv[0]); > + > + if (!strstr(argv[1], "..")) > + warning(_("no .. in range: '%s'"), argv[1]); > + strbuf_addstr(&range2, argv[1]); Should these die() (like the "..." case below) rather than warning()? Warning and continuing doesn't seem like intended behavior. When I test this with on git.git and omit the "..", git sits for a long, long time consuming the CPU. I guess it's git-log'ing pretty much the entire history. % GIT_TRACE=1 git range-diff v1 v2 warning: no .. in range: 'v1' warning: no .. in range: 'v2' trace: git log --no-color -p --no-merges --reverse \ --date-order --decorate=no --no-abbrev-commit v1 ^C % > + } else if (argc == 3) { > + strbuf_addf(&range1, "%s..%s", argv[0], argv[1]); > + strbuf_addf(&range2, "%s..%s", argv[0], argv[2]); > + } else if (argc == 1) { > + const char *b = strstr(argv[0], "..."), *a = argv[0]; > + int a_len; > + > + if (!b) > + die(_("single arg format requires a symmetric range")); > diff --git a/range-diff.c b/range-diff.c > @@ -0,0 +1,307 @@ > +static int read_patches(const char *range, struct string_list *list) > +{ > + while (strbuf_getline(&line, in) != EOF) { > + if (skip_prefix(line.buf, "commit ", &p)) { > + [...] > + in_header = 1; > + continue; > + } > + if (starts_with(line.buf, "diff --git")) { > + in_header = 0; > + [...] > + } else if (in_header) { > + if (starts_with(line.buf, "Author: ")) { > + [...] > + } else if (starts_with(line.buf, " ")) { > + [...] > + } > + continue; > + } else if (starts_with(line.buf, "@@ ")) > + strbuf_addstr(&buf, "@@"); > + else if (line.buf[0] && !starts_with(line.buf, "index ")) > + /* > + * A completely blank (not ' \n', which is context) > + * line is not valid in a diff. We skip it > + * silently, because this neatly handles the blank > + * separator line between commits in git-log > + * output. > + */ > + strbuf_addbuf(&buf, &line); This comment had me confused for a bit since it doesn't seem to agree with the 'then' part of the 'if', but rather applies more to the 'else'. Had it been split into two parts (one for 'then' and one for 'else'), it might have been easier to digest. That is, something like: else if (line.buf[0] && !starts_with(..., "index ")) /* A line we wish to keep. */ strbuf_addbuf(...); else /* * A completely blank line between commits or * or one in which we are otherwise not interested. */ continue; or something. Structuring it a bit differently might have helped, as well: else if (!line.buf[0]) /* A completely blank line between commits. */ continue; else if (starts_with(..., "index ")) /* A line in which we are not interested. */ continue; else strbuf_addbuf(&buf, &line); Not at all worth a re-roll. > + else > + continue; > + if (util) > + string_list_append(list, buf.buf)->util = util; So, the parser is grabbing each commit and shoving all the "interesting" information about the commit in a 'patch_util'. It grabs the OID, author, the commit message (indented), the "diff --git", "+++", "---" lines (but ignores "index" line), "@@" lines (but ignoring the gunk after "@@"), and all context and patch lines. Looks good. > + strbuf_release(&buf); > + > + if (finish_command(&cp)) > + return -1; > + > + return 0; > +}