The previous rounds were at <20180520211210.1248-1-t.gummerer@xxxxxxxxx> and <20180605215219.28783-1-t.gummerer@xxxxxxxxx>. This round is a more polished version of the previous round, as suggested by Junio in <xmqq1scgmemy.fsf@xxxxxxxxxxxxxxxxxxxxxxxxx>. It's also rebased on v2.18. The series grew by one patch, because 8/10 has been split into two patches hopefully making it easier to follow. range-diff before below: 1: 2825342cc2 = 1: 018bd68a8a rerere: unify error messages when read_cache fails 2: d1500028aa ! 2: 281fcbf24f rerere: lowercase error messages @@ -4,7 +4,8 @@ Documentation/CodingGuidelines mentions that error messages should be lowercase. Prior to marking them for translation follow that pattern - in rerere as well. + in rerere as well, so translators won't have to translate messages + that don't conform to our guidelines. Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx> @@ -87,3 +88,12 @@ /* Nuke the recorded resolution for the conflict */ id = new_rerere_id(sha1); +@@ + handle_cache(path, sha1, rerere_path(id, "thisimage")); + if (read_mmfile(&cur, rerere_path(id, "thisimage"))) { + free(cur.ptr); +- error("Failed to update conflicted state in '%s'", path); ++ error("failed to update conflicted state in '%s'", path); + goto fail_exit; + } + cleanly_resolved = !try_merge(id, path, &cur, &result); 3: ed3601ee71 = 3: b6d5e2e26d rerere: wrap paths in output in sq 4: 6ead84a199 ! 4: 45f0d7a99f rerere: mark strings for translation @@ -4,7 +4,7 @@ 'git rerere' is considered a plumbing command and as such its output should be translated. Its functionality is also only enabled through - a config setting, so scripts really shouldn't rely on its output + a config setting, so scripts really shouldn't rely on the output either way. Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx> @@ -219,8 +219,8 @@ handle_cache(path, sha1, rerere_path(id, "thisimage")); if (read_mmfile(&cur, rerere_path(id, "thisimage"))) { free(cur.ptr); -- error("Failed to update conflicted state in '%s'", path); -+ error(_("Failed to update conflicted state in '%s'"), path); +- error("failed to update conflicted state in '%s'", path); ++ error(_("failed to update conflicted state in '%s'"), path); goto fail_exit; } cleanly_resolved = !try_merge(id, path, &cur, &result); 5: caad276aca ! 5: 993857a816 rerere: add some documentation @@ -1,6 +1,6 @@ Author: Thomas Gummerer <t.gummerer@xxxxxxxxx> - rerere: add some documentation + rerere: add documentation for conflict normalization Add some documentation for the logic behind the conflict normalization in rerere. @@ -27,30 +27,28 @@ +when different conflict style settings are used, rerere normalizes the +conflicts before writing them to the rerere database. + -+Differnt conflict styles and branch names are dealt with by stripping -+that information from the conflict markers, and removing extraneous -+information from the `diff3` conflict style. -+ -+Branches being merged in different order are dealt with by sorting the -+conflict hunks. More on each of those parts in the following -+sections. ++Different conflict styles and branch names are normalized by stripping ++the labels from the conflict markers, and removing extraneous ++information from the `diff3` conflict style. Branches that are merged ++in different order are normalized by sorting the conflict hunks. More ++on each of those steps in the following sections. + +Once these two normalization operations are applied, a conflict ID is -+created based on the normalized conflict, which is later used by ++calculated based on the normalized conflict, which is later used by +rerere to look up the conflict in the rerere database. + +Stripping extraneous information +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Say we have three branches AB, AC and AC2. The common ancestor of -+these branches has a file with with a line with the string "A" (for -+brevity this line is called "line A" for brevity in the following) in -+it. In branch AB this line is changed to "B", in AC, this line is -+changed to C, and branch AC2 is forked off of AC, after the line was -+changed to C. ++these branches has a file with a line containing the string "A" (for ++brevity this is called "line A" in the rest of the document). In ++branch AB this line is changed to "B", in AC, this line is changed to ++"C", and branch AC2 is forked off of AC, after the line was changed to ++"C". + -+Now forking a branch ABAC off of branch AB and then merging AC into it, -+we'd get a conflict like the following: ++Forking a branch ABAC off of branch AB and then merging AC into it, we ++get a conflict like the following: + + <<<<<<< HEAD + B @@ -58,9 +56,9 @@ + C + >>>>>>> AC + -+Now doing the analogous with AC2 (forking a branch ABAC2 off of branch -+AB and then merging branch AC2 into it), maybe using the diff3 -+conflict style, we'd get a conflict like the following: ++Doing the analogous with AC2 (forking a branch ABAC2 off of branch AB ++and then merging branch AC2 into it), using the diff3 conflict style, ++we get a conflict like the following: + + <<<<<<< HEAD + B 6: ad88a6b8a8 = 6: a7a0f657f3 rerere: fix crash when conflict goes unresolved 7: 15f9efcba6 = 7: f1afd4b9a4 rerere: only return whether a path has conflicts or not 8: 1490efaad3 ! 8: 1f5cef506a rerere: factor out handle_conflict function @@ -3,53 +3,14 @@ rerere: factor out handle_conflict function Factor out the handle_conflict function, which handles a single - conflict in a path. This is a preparation for the next step, where - this function will be re-used. No functional changes intended. + conflict in a path. This is in preparation for a subsequent commit, + where this function will be re-used. No functional changes intended. Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx> diff --git a/rerere.c b/rerere.c --- a/rerere.c +++ b/rerere.c -@@ - ferr_puts(str, io->output, &io->wrerror); - } - --/* -- * Write a conflict marker to io->output (if defined). -- */ --static void rerere_io_putconflict(int ch, int size, struct rerere_io *io) --{ -- char buf[64]; -- -- while (size) { -- if (size <= sizeof(buf) - 2) { -- memset(buf, ch, size); -- buf[size] = '\n'; -- buf[size + 1] = '\0'; -- size = 0; -- } else { -- int sz = sizeof(buf) - 1; -- -- /* -- * Make sure we will not write everything out -- * in this round by leaving at least 1 byte -- * for the next round, giving the next round -- * a chance to add the terminating LF. Yuck. -- */ -- if (size <= sz) -- sz -= (sz - size) + 1; -- memset(buf, ch, sz); -- buf[sz] = '\0'; -- size -= sz; -- } -- rerere_io_putstr(buf, io); -- } --} -- - static void rerere_io_putmem(const char *mem, size_t sz, struct rerere_io *io) - { - if (io->output) @@ return isspace(*buf); } @@ -67,14 +28,7 @@ - * hunks and -1 if an error occured. - */ -static int handle_path(unsigned char *sha1, struct rerere_io *io, int marker_size) -+static void rerere_strbuf_putconflict(struct strbuf *buf, int ch, size_t size) -+{ -+ strbuf_addchars(buf, ch, size); -+ strbuf_addch(buf, '\n'); -+} -+ -+static int handle_conflict(struct strbuf *out, struct rerere_io *io, -+ int marker_size, git_SHA_CTX *ctx) ++static int handle_conflict(struct rerere_io *io, int marker_size, git_SHA_CTX *ctx) { - git_SHA_CTX ctx; - int has_conflicts = 0; @@ -88,38 +42,39 @@ - - if (sha1) - git_SHA1_Init(&ctx); -- -+ int has_conflicts = 1; ++ int has_conflicts = -1; + while (!io->getline(&buf, io)) { -- if (is_cmarker(buf.buf, '<', marker_size)) { + if (is_cmarker(buf.buf, '<', marker_size)) { - if (hunk != RR_CONTEXT) - goto bad; - hunk = RR_SIDE_1; -- } else if (is_cmarker(buf.buf, '|', marker_size)) { -+ if (is_cmarker(buf.buf, '<', marker_size)) -+ goto bad; -+ else if (is_cmarker(buf.buf, '|', marker_size)) { ++ break; + } else if (is_cmarker(buf.buf, '|', marker_size)) { if (hunk != RR_SIDE_1) - goto bad; +- goto bad; ++ break; hunk = RR_ORIGINAL; -@@ - goto bad; + } else if (is_cmarker(buf.buf, '=', marker_size)) { + if (hunk != RR_SIDE_1 && hunk != RR_ORIGINAL) +- goto bad; ++ break; + hunk = RR_SIDE_2; + } else if (is_cmarker(buf.buf, '>', marker_size)) { + if (hunk != RR_SIDE_2) +- goto bad; ++ break; if (strbuf_cmp(&one, &two) > 0) strbuf_swap(&one, &two); -- has_conflicts = 1; + has_conflicts = 1; - hunk = RR_CONTEXT; -- rerere_io_putconflict('<', marker_size, io); -- rerere_io_putmem(one.buf, one.len, io); -- rerere_io_putconflict('=', marker_size, io); -- rerere_io_putmem(two.buf, two.len, io); -- rerere_io_putconflict('>', marker_size, io); + rerere_io_putconflict('<', marker_size, io); + rerere_io_putmem(one.buf, one.len, io); + rerere_io_putconflict('=', marker_size, io); + rerere_io_putmem(two.buf, two.len, io); + rerere_io_putconflict('>', marker_size, io); - if (sha1) { - git_SHA1_Update(&ctx, one.buf ? one.buf : "", -+ rerere_strbuf_putconflict(out, '<', marker_size); -+ strbuf_addbuf(out, &one); -+ rerere_strbuf_putconflict(out, '=', marker_size); -+ strbuf_addbuf(out, &two); -+ rerere_strbuf_putconflict(out, '>', marker_size); + if (ctx) { + git_SHA1_Update(ctx, one.buf ? one.buf : "", one.len + 1); @@ -129,7 +84,7 @@ } - strbuf_reset(&one); - strbuf_reset(&two); -+ goto out; ++ break; } else if (hunk == RR_SIDE_1) strbuf_addbuf(&one, &buf); else if (hunk == RR_ORIGINAL) @@ -143,9 +98,6 @@ - hunk = 99; /* force error exit */ - break; } -+bad: -+ has_conflicts = -1; -+out: strbuf_release(&one); strbuf_release(&two); strbuf_release(&buf); @@ -169,24 +121,20 @@ +{ + git_SHA_CTX ctx; + struct strbuf buf = STRBUF_INIT; -+ struct strbuf out = STRBUF_INIT; + int has_conflicts = 0; + if (sha1) + git_SHA1_Init(&ctx); + + while (!io->getline(&buf, io)) { + if (is_cmarker(buf.buf, '<', marker_size)) { -+ has_conflicts = handle_conflict(&out, io, marker_size, -+ sha1 ? &ctx : NULL); ++ has_conflicts = handle_conflict(io, marker_size, ++ sha1 ? &ctx : NULL); + if (has_conflicts < 0) + break; -+ rerere_io_putmem(out.buf, out.len, io); -+ strbuf_reset(&out); + } else + rerere_io_putstr(buf.buf, io); + } + strbuf_release(&buf); -+ strbuf_release(&out); + if (sha1) git_SHA1_Final(sha1, &ctx); -: ---------- > 9: 8ac0d3e903 rerere: return strbuf from handle path 9: 6619650c42 ! 10: ef84fdc201 rerere: teach rerere to handle nested conflicts @@ -27,7 +27,7 @@ >>>>>>> branch-2 >>>>>>> branch-3~ - it would be reordered as follows in the preimage: + it would be recorde as follows in the preimage: <<<<<<< 1 @@ -40,14 +40,16 @@ >>>>>>> and the conflict ID would be calculated as + sha1(1<NUL><<<<<<< 2 ======= 3 >>>>>>><NUL>) - Stripping out vs. leaving the conflict markers in place should have no - practical impact, but it simplifies the implementation. + Stripping out vs. leaving the conflict markers in place in the inner + conflict should have no practical impact, but it simplifies the + implementation. Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx> @@ -63,9 +65,11 @@ +~~~~~~~~~~~~~~~~ + +Nested conflicts are handled very similarly to "simple" conflicts. -+Same as before, labels on conflict markers and diff3 output is -+stripped, and the conflict hunks are sorted, for both the outer and -+the inner conflict. ++Similar to simple conflicts, the conflict is first normalized by ++stripping the labels from conflict markers, stripping the diff3 ++output, and the sorting the conflict hunks, both for the outer and the ++inner conflict. This is done recursively, so any number of nested ++conflicts can be handled. + +The only difference is in how the conflict ID is calculated. For the +inner conflict, the conflict markers themselves are not stripped out @@ -83,16 +87,16 @@ + >>>>>>> branch-2 + >>>>>>> branch-3~ + -+After stripping out the labels of the conflict markers, the conflict -+would look as follows: ++After stripping out the labels of the conflict markers, and sorting ++the hunks, the conflict would look as follows: + + <<<<<<< + 1 + ======= + <<<<<<< -+ 3 -+ ======= + 2 ++ ======= ++ 3 + >>>>>>> + >>>>>>> + @@ -108,23 +112,21 @@ struct strbuf one = STRBUF_INIT, two = STRBUF_INIT; - struct strbuf buf = STRBUF_INIT; + struct strbuf buf = STRBUF_INIT, conflict = STRBUF_INIT; - int has_conflicts = 1; + int has_conflicts = -1; + while (!io->getline(&buf, io)) { -- if (is_cmarker(buf.buf, '<', marker_size)) -- goto bad; -- else if (is_cmarker(buf.buf, '|', marker_size)) { -+ if (is_cmarker(buf.buf, '<', marker_size)) { + if (is_cmarker(buf.buf, '<', marker_size)) { +- break; + if (handle_conflict(&conflict, io, marker_size, NULL) < 0) -+ goto bad; ++ break; + if (hunk == RR_SIDE_1) + strbuf_addbuf(&one, &conflict); + else + strbuf_addbuf(&two, &conflict); + strbuf_release(&conflict); -+ } else if (is_cmarker(buf.buf, '|', marker_size)) { + } else if (is_cmarker(buf.buf, '|', marker_size)) { if (hunk != RR_SIDE_1) - goto bad; - hunk = RR_ORIGINAL; + break; diff --git a/t/t4200-rerere.sh b/t/t4200-rerere.sh --- a/t/t4200-rerere.sh @@ -169,6 +171,5 @@ + cat test >actual && + test_cmp expect actual +' -+ + test_done 10: 4b11dce7dd = 11: 35a826908f rerere: recalculate conflict ID when unresolved conflict is committed Thomas Gummerer (11): rerere: unify error messages when read_cache fails rerere: lowercase error messages rerere: wrap paths in output in sq rerere: mark strings for translation rerere: add documentation for conflict normalization rerere: fix crash when conflict goes unresolved rerere: only return whether a path has conflicts or not rerere: factor out handle_conflict function rerere: return strbuf from handle path rerere: teach rerere to handle nested conflicts rerere: recalculate conflict ID when unresolved conflict is committed Documentation/technical/rerere.txt | 182 +++++++++++++++++++++ builtin/rerere.c | 4 +- rerere.c | 243 ++++++++++++++--------------- t/t4200-rerere.sh | 66 ++++++++ 4 files changed, 366 insertions(+), 129 deletions(-) create mode 100644 Documentation/technical/rerere.txt -- 2.18.0.233.g985f88cf7e