On 12/30/2020 10:10 AM, Elijah Newren wrote: > On Wed, Dec 30, 2020 at 6:16 AM Derrick Stolee <stolee@xxxxxxxxx> wrote: >> On 12/18/2020 12:51 AM, Elijah Newren via GitGitGadget wrote: >>> From: Elijah Newren <newren@xxxxxxxxx> >>> >>> Implement unique_path(), based on the one from merge-recursive.c. It is >>> simplified, however, due to: (1) using strmaps, and (2) the fact that >>> merge-ort lets the checkout codepath handle possible collisions with the >>> working tree means that other code locations don't have to. >>> >>> Signed-off-by: Elijah Newren <newren@xxxxxxxxx> >>> --- >>> merge-ort.c | 25 ++++++++++++++++++++++++- >>> 1 file changed, 24 insertions(+), 1 deletion(-) >>> >>> diff --git a/merge-ort.c b/merge-ort.c >>> index d300a02810e..1adc27a11bc 100644 >>> --- a/merge-ort.c >>> +++ b/merge-ort.c >>> @@ -343,11 +343,34 @@ static void path_msg(struct merge_options *opt, >>> strbuf_addch(sb, '\n'); >>> } >>> >>> +/* add a string to a strbuf, but converting "/" to "_" */ >>> +static void add_flattened_path(struct strbuf *out, const char *s) >>> +{ >>> + size_t i = out->len; >>> + strbuf_addstr(out, s); >>> + for (; i < out->len; i++) >>> + if (out->buf[i] == '/') >>> + out->buf[i] = '_'; >>> +} >>> + >> >> Thank you for pointing out that you based your code on merge-recursive.c. >> I see that this implementation is identical to the one there. I question >> whether this causes collisions in a problematic way, when "a/b/c" and >> "a_b_c" both exist in a tree. >> >> To avoid such a problem, we'd likely need to also expand "_" to "__" or >> similar. This might never actually affect any users because of the >> strange filename matches _and_ we need to be in a directory/file conflict. >> >> And maybe it's not a hole at all? If it is, we can consider patching or >> at least documenting the problem. > > add_flattened_path() can certainly result in a collision, regardless > of whether the char *s parameter has any '/' characters in it. For > example, if you are trying to get a unique path associated with > builtin/commit-graph.c due to changes from the 'next' branch side of > the merge, and builtin/commit-graph.c~next already exists, then you > have a collision. It's actually pretty rare that the parameter would > have any '/' characters at all, since it's pretty rare for me to see > folks (other than Junio) use hierarchical branch names. But if the > branch name were ds/line-log-on-bloom, then the provisional filename > would be builtin/commit-graph.c~ds_line-log-on-bloom. The '/' to '_' > conversion exists just to make sure our new file remains in the same > directory as where the conflict that caused us to need a new unique > path occurred. Ah, I am misinterpreting which '/' characters are getting squashed. Thank you for fixing my confusion. > But unique_path() does NOT end immediately after calling > add_flattened_path() and there is no collision possible in the return > from unique_path(), because it ends with a > while (strmap_contains(existing_paths, newpath.buf)) { > loop that modifies the resulting path until it finds one that doesn't > collide with an existing path. And this extra care here is helpful. Thanks! -Stolee