On Tue, May 16, 2017 at 10:19 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Stefan Beller <sbeller@xxxxxxxxxx> writes: > >> In a later patch, I want to propose an option to detect&color >> moved lines in a diff, which cannot be done in a one-pass over >> the diff. Instead we need to go over the whole diff twice, >> because we cannot detect the first line of the two corresponding >> lines (+ and -) that got moved. >> >> So to prepare the diff machinery for two pass algorithms >> (i.e. buffer it all up and then operate on the result), >> move all emissions to places, such that the only emitting >> function is emit_line_0. >> >> This prepares the code for submodules to go through the >> emit_line function. >> >> Signed-off-by: Stefan Beller <sbeller@xxxxxxxxxx> >> --- >> diff.c | 20 +++++++--------- >> diff.h | 5 ++++ >> submodule.c | 78 ++++++++++++++++++++++++++++++------------------------------- >> submodule.h | 9 +++---- >> 4 files changed, 56 insertions(+), 56 deletions(-) >> >> diff --git a/diff.c b/diff.c >> index 690794aeb8..7c8d6a5d12 100644 >> --- a/diff.c >> +++ b/diff.c >> @@ -516,8 +516,8 @@ static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2, >> ecbdata->blank_at_eof_in_postimage = (at - l2) + 1; >> } >> >> -static void emit_line(struct diff_options *o, const char *set, const char *reset, >> - int add_line_prefix, int sign, const char *line, int len) >> +void emit_line(struct diff_options *o, const char *set, const char *reset, >> + int add_line_prefix, int sign, const char *line, int len) >> { >> int has_trailing_newline, has_trailing_carriage_return; >> FILE *file = o->file; >> @@ -547,10 +547,10 @@ static void emit_line(struct diff_options *o, const char *set, const char *reset >> fputc('\n', file); >> } >> >> -static void emit_line_fmt(struct diff_options *o, >> - const char *set, const char *reset, >> - int add_line_prefix, >> - const char *fmt, ...) >> +void emit_line_fmt(struct diff_options *o, >> + const char *set, const char *reset, >> + int add_line_prefix, >> + const char *fmt, ...) > > Interesting... > >> -static void show_submodule_header(FILE *f, const char *path, >> - const char *line_prefix, >> +static void show_submodule_header(struct diff_options *o, const char *path, >> struct object_id *one, struct object_id *two, >> unsigned dirty_submodule, const char *meta, >> const char *reset, > > Is this ONLY called when the caller wants its output inserted to the > "diff" (or "log -p") output? Yes. > If so, I think it makes sense to pass > 'o', but if the function is oblivious that it is driven to produce > part of a "diff", it feels wrong to pass 'o'. The original was > taking a "FILE *" and line_prefix, so it is rather clear that the > answer to the question is "yes, this is very closely tied to diff > output". Now you have access to 'o', so you do not need to pass > them separately. Good. ok. > Each line in its output, when incorporated in "diff" or "log -p" > output, must be prefixed with the line-prefix to accomodate users of > "log --graph", so I guess it cannot be helped. Your calls to > emit_line_fmt() below seems to ask the line-prefix to be added, > which is good, too. > > How does capturing these lines help moved line detection, by the > way? These must never be matched with any other added or removed > line in the real patch output. Why? Actually I think it has some value if it can match across (submodule-)repository boundaries, e.g. think of Ævars RFC to put SHA1DC into a submodule. If reviewing that commit later on, a user may be interested in "what is the difference between what we carried so far in this repo compared to what we point at now in the submodule". Most of the code should be the same, but anchored at a different path/repo, so a move detection would be super helpful. I do understand that you may not want to see a move crossing a repo boundary, but I would prefer that to a later patch, once we have a better understanding on the use cases of this new feature. >> if (is_null_oid(one)) >> message = "(new submodule)"; > > emit_line() and emit_line_fmt() are both inappropriate names for a > global function. These are very closely tied to diff generation, so > we probably want to see some form of "diff" in their names. Oh, uh. You're right. I would think inside of diff.c we'd still want to keep the short name, so maybe I'd expose a wrapper to the outside world. Maybe diff_emit_strbuf(diff_options *, strbuf *) would be fine for all use cases from outside. > The fact that it is clear because its first parameter is "struct > diff_options" is insufficient---"you cannot tell what context the > function is meant to be used by only looking at its name" is > certainly solved by its function signature, but the other issue with > an overly generic name is that other codepaths in different contexts > may want to use such a short and sweet name. > > Thanks. I am bad at naming, so if you have a better idea for names, feel free to mention them. Thanks, Stefan