Hello Junio, Keshav Thank you very much for your detailed reviewing. I just tried to update the patch and posted as "[PATCH v6]". --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsuneo@xxxxxxxxx On Oct 16, 2013, at 1:54 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Yoshioka Tsuneo <yoshiokatsuneo@xxxxxxxxx> writes: > >> "git diff -M --stat" can detect rename and show renamed file name like >> "foofoofoo => barbarbar". But if destination filename is long, the line >> is shortened like "...barbarbar" so there is no way to know whether the >> file is renamed or existed in the source commit. > > Is "destination" filename more special than the source filename? > Perhaps "s/if destination filename is/if filenames are/"? > > Note: I do not want you to reroll using the suggested > wording without explanation; it may be possible that I am > missing something obvious and do not understand why you > singled out destination, in which case I'd rather see it > explained better in the log message than the potentially > suboptimal suggestion I made in the review without > understanding the issue. Of course, it is possible that you > want to do the same when source is overlong, in which case > you can just say "Yeah, you're right; will reroll". > > The above applies to all the other comments in this message. > > Also "s/source commit/original/". You may not be comparing two > commits after all. > >> Make sure there is always an arrow, like "...foo => ...bar". >> The output can contains curly braces('{','}') for grouping. > > s/contains/contain/; > >> So, in general, the outpu format is "<pfx>{<mid_a> => <mid_b>}<sfx>" > > s/outpu/&t/; > >> To keep arrow("=>"), try to omit <pfx> as long as possible at first >> because later part or changing part will be the more important part. >> If it is not enough, shorten <mid_a>, <mid_b>, and <sfx> trying to >> have the maximum length the same because those will be equaly important. > > A sound reasoning. > >> Signed-off-by: Tsuneo Yoshioka <yoshiokatsuneo@xxxxxxxxx> >> Test-added-by: Thomas Rast <trast@xxxxxxxxxxx> >> --- >> diff.c | 187 +++++++++++++++++++++++++++++++++++++++++++------ >> t/t4001-diff-rename.sh | 12 ++++ >> 2 files changed, 177 insertions(+), 22 deletions(-) >> >> diff --git a/diff.c b/diff.c >> index a04a34d..cf50807 100644 >> --- a/diff.c >> +++ b/diff.c >> @@ -1258,11 +1258,12 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) >> } >> } >> >> -static char *pprint_rename(const char *a, const char *b) >> +static void pprint_rename_find_common_prefix_suffix(const char *a, const char *b >> + , struct strbuf *pfx, struct strbuf *a_mid >> + , struct strbuf *b_mid, struct strbuf *sfx) > > What kind of line splitting is this? > > I think the real issue is that the function name is overly long, but > aside from that, > > - comma comes at the end of the line, not at the beginning of the > next line; > > - the second and subsequent lines are indented, but not more than > the usual line width (align with the first letter inside the > opening parenthesis of the first line); > > - a_mid and b_mid are more "alike" than pfx and a_mid. > > so I would expect to see it more like: > > static void abbrev_rename(const char *a, const char *b, > struct strbuf *pfx, > struct strbuf *a_mid, struct strbuf *b_mid, > struct strbuf *sfx) > > Note that the suggested name does not say "pprint", because in your > version of this file, the code around here is no longer doing any > printing. The caller does so after using this function to decide > how to abbreviate renames, so naming the helper function after what > it does (e.g. abbreviate renames) is more appropriate. > >> { >> const char *old = a; >> const char *new = b; >> - struct strbuf name = STRBUF_INIT; >> int pfx_length, sfx_length; >> int pfx_adjust_for_slash; >> int len_a = strlen(a); >> @@ -1272,10 +1273,9 @@ static char *pprint_rename(const char *a, const char *b) >> int qlen_b = quote_c_style(b, NULL, NULL, 0); >> >> if (qlen_a || qlen_b) { >> - quote_c_style(a, &name, NULL, 0); >> - strbuf_addstr(&name, " => "); >> - quote_c_style(b, &name, NULL, 0); >> - return strbuf_detach(&name, NULL); >> + quote_c_style(a, a_mid, NULL, 0); >> + quote_c_style(b, b_mid, NULL, 0); >> + return; >> } >> >> /* Find common prefix */ >> @@ -1321,18 +1321,151 @@ static char *pprint_rename(const char *a, const char *b) >> a_midlen = 0; >> if (b_midlen < 0) >> b_midlen = 0; >> + > > Trailing whitespace (there are many others you added to this file; I > won't bother to point out all of them). > >> + strbuf_add(pfx, a, pfx_length); >> + strbuf_add(a_mid, a + pfx_length, a_midlen); >> + strbuf_add(b_mid, b + pfx_length, b_midlen); >> + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); >> +} >> + >> +/* >> + * Omit each parts to fix in name_width. >> + * Formatted string is "<pfx>{<a_mid> => <b_mid>}<sfx>". >> + * At first, omit <pfx> as long as possible. >> + * If it is not enough, omit <a_mid>, <b_mid>, <sfx> by tring to set the length of >> + * those 3 parts(including "...") to the same. >> + * Ex: >> + * "foofoofoo => barbarbar" >> + * will be like >> + * "...foo => ...bar". >> + * "long_parent{foofoofoo => barbarbar}longfilename" >> + * will be like >> + * "...parent{...foofoo => ...barbar}...lename" >> + */ >> +static void pprint_rename_omit(struct strbuf *pfx, struct strbuf *a_mid, struct strbuf *b_mid >> + , struct strbuf *sfx, int name_width) > > Bad line splitting. > >> +{ >> + >> +#define ARROW " => " >> +#define ELLIPSIS "..." > > Ugly and leaks these symbols after the function is done using them > to the remainder of this file. Write them like this instead, perhaps? > > static const char arrow[] = " => "; > static const char dots[] = "..."; > >> +#define swap(a, b) myswap((a), (b), sizeof(a)) >> + >> +#define myswap(a, b, size) do { \ >> +unsigned char mytmp[size]; \ >> +memcpy(mytmp, &a, size); \ >> +memcpy(&a, &b, size); \ >> +memcpy(&b, mytmp, size); \ >> +} while (0) > > These are totally unneeded, I suspect (see below). > >> + >> + int use_curly_braces = (pfx->len > 0) || (sfx->len > 0); >> + size_t name_len; >> + size_t len; >> + size_t part_lengths[4]; > > Do not name an array in plural, i.e. elements[], unless there is a > compelling reason to do so. By using singular, e.g. element[], the > third element can be spelled as element[3], which is more logical > than having to call it elements[3]. > > Side note. a notable exception is an array that is used as a > hash-table and frequently passed around as an argument; you > are usually not interested in iterating over it in ascending > order, and being able to call such a collection of things > "things" in plural, e.g. struct object objects[], is more > important. > >> + size_t max_part_len = 0; >> + size_t remainder_part_len = 0; >> + int i, j; >> + >> + name_len = pfx->len + a_mid->len + b_mid->len + sfx->len + strlen(ARROW) >> + + (use_curly_braces ? 2 : 0); >> + >> + if (name_len <= name_width) { >> + /* Everthing fits in name_width */ >> + return; >> + } >> + >> + if (use_curly_braces) { >> + if (strlen(ELLIPSIS) + (name_len - pfx->len) <= name_width) { >> + /* >> + Just omitting left of '{' is enough >> + Ex: ...aaa{foofoofoo => bar}file >> + */ > > /* > * We format our multi-line > * comments like > * this. > */ > >> + strbuf_splice(pfx, name_len - pfx->len, name_width - (name_len - pfx->len), ELLIPSIS, strlen(ELLIPSIS)); > > Overlong line. > > Is the math for the second and third arguments correct? If you are > making "abcdefghij" into "...hij", you would splice at position 0 > for length up to 'g', so it felt strange to see any arithmetic as > the second argument, but I didn't look at this code very closely. > >> + return; >> + } else { >> + if (pfx->len > strlen(ELLIPSIS)) { >> + /* >> + Just omitting left of '{' is not enough >> + name will be "...{SOMETHING}SOMETHING" >> + */ >> + strbuf_reset(pfx); >> + strbuf_addstr(pfx, ELLIPSIS); >> + } >> + } >> + } >> >> - strbuf_grow(&name, pfx_length + a_midlen + b_midlen + sfx_length + 7); >> - if (pfx_length + sfx_length) { >> - strbuf_add(&name, a, pfx_length); >> + /* available length for a_mid, b_mid and sfx */ >> + len = name_width - strlen(ARROW) - (use_curly_braces ? 2 : 0); >> + >> + /* a_mid, b_mid, sfx will be have the same max, including ellipsis("..."). */ >> + part_lengths[0] = (int)a_mid->len; >> + part_lengths[1] = (int)b_mid->len; >> + part_lengths[2] = (int)sfx->len; > > What are these casts about? strbuf.len is of size_t which is > already the correct type for part_length[]. > >> + >> + /* bubble sort of part_lengths, descending order */ > > Do not bubble sort. Unless there is a compelling reason not to > (liek you are in a performance critical section and want to use a > custom sort algorithm), just let the platform-supplied qsort(3) do > the job by writing a small comparison function. > >> + for (i=0; i<3; i++) { >> + for (j=i+1; j<3; j++) { >> + if (part_lengths[j] > part_lengths[i]) { >> + swap(part_lengths[i], part_lengths[j]); >> + } >> + } >> + } >> + >> + if (part_lengths[1] + part_lengths[1] + part_lengths[2] <= len) { >> + /* >> + * "{...foofoo => barbar}file" >> + * There is only one omitting part. > > s/omitting/omitted/; > >> + */ >> + max_part_len = len - part_lengths[1] - part_lengths[2]; >> + } else if (part_lengths[2] + part_lengths[2] + part_lengths[2] <= len) { >> + /* >> + * "{...foofoo => ...barbar}file" >> + * There are 2 omitting part. > > s/omitting part/omitted parts/; > >> + */ >> + max_part_len = (len - part_lengths[2]) / 2; >> + remainder_part_len = (len - part_lengths[2]) - max_part_len * 2; >> + } else { >> + /* >> + * "{...ofoo => ...rbar}...file" >> + * There are 3 omitting part. > > Likewise. > -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html