Re: [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 30, 2022 at 3:41 AM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
> On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@xxxxxxxxx>
> >
> > There is code in both merge-recursive and merge-ort for avoiding doubly
> > transitive renames (i.e. one side renames directory A/ -> B/, and the
> > other side renames directory B/ -> C/), because this combination would
> > otherwise make a mess for new files added to A/ on the first side and
> > wondering which directory they end up in -- especially if there were
> > even more renames such as the first side renaming C/ -> D/.  In such
> > cases, it just turns "off" directory rename detection for the higher
> > order transitive cases.
> >
> > The testcases added in t6423 a couple commits ago are slightly different
> > but similar in principle.  They involve a similar case of paired
> > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames
> > a leading directory of B/ to C/.  And both sides add a new file
> > somewhere under the directory that the other side will rename.  While
> > the new files added start within different directories and thus could
> > logically end up within different directories, it is weird for a file
> > on one side to end up where the other one started and not move along
> > with it.  So, let's just turn off directory rename detection in this
> > case as well.
> >
> > Another way to look at this is that if the source name involved in a
> > directory rename on one side is the target name of a directory rename
> > operation for a file from the other side, then we avoid the doubly
> > transitive rename.  (More concretely, if a directory rename on side D
> > wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D
> > already had a file named NEW_NAME, and a directory rename on side E
> > wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the
> > directory rename detection for NEW_NAME to prevent the
> > NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add
> > conflict on NEW_NAME.)
> >
> > Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> > ---
> >  merge-ort.c                         | 7 +++++++
> >  t/t6423-merge-rename-directories.sh | 4 ++--
> >  2 files changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/merge-ort.c b/merge-ort.c
> > index fa6667de18c..17db4c30e5b 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -2292,9 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt,
> >       struct strmap_entry *rename_info;
> >       struct strmap_entry *otherinfo = NULL;
> >       const char *new_dir;
> > +     int other_side = 3 - side_index;
> >
> > +     /*
> > +      * Cases where we don't have or don't want a directory rename for
> > +      * this path, so we return NULL.
> > +      */
> >       if (strmap_empty(dir_renames))
> >               return new_path;
> > +     if (strmap_get(&collisions[other_side], path))
> > +             return new_path;
>
> I realize from looking at merge-recursive.c that this is carrying
> forward some legacy debt

...which was introduced by me as well...

> , but I find this code and the need for a
> comment more complex than it needs to. On top of master this will work
> just as well:
>
>         diff --git a/merge-ort.c b/merge-ort.c
>         index b5015b9afd4..f5a02b1ff6f 100644
>         --- a/merge-ort.c
>         +++ b/merge-ort.c
>         @@ -2268,16 +2268,16 @@ static char *check_for_directory_rename(struct merge_options *opt,
>                                                 struct strmap *collisions,
>                                                 int *clean_merge)
>          {
>         -       char *new_path = NULL;
>         +       char *new_path;
>                 struct strmap_entry *rename_info;
>                 struct strmap_entry *otherinfo = NULL;
>                 const char *new_dir;
>
>                 if (strmap_empty(dir_renames))
>         -               return new_path;
>         +               return NULL;
>                 rename_info = check_dir_renamed(path, dir_renames);
>                 if (!rename_info)
>         -               return new_path;
>         +               return NULL;
>                 /* old_dir = rename_info->key; */
>                 new_dir = rename_info->value;

You know, while making this patch series I was asking myself, "Why
didn't I just explicitly return NULL here?"  I don't know why I did it
this way, but it now seems slightly odd to me too.  I decided to not
fix it up and just provide a more targeted fix, but since it tripped
you up, I'm happy to add this as a preparatory cleanup.

> I.e. we're really just making the reader squint to see that we're
> actually returning NULL here, it's only later that we have an actual
> "new path".
>
> Wouldn't sticking that earlier in this series be an improvement? The
> reason you need to explain "so we return NULL" is because we're carrying
> forward this oddity, but if we just don't initialize it and return NULL
> instead...

I still think the comment makes sense to add, other than the "so we
return NULL" bit, even after this change.  But yeah, we can strike the
"so we return NULL" part of it after this cleanup.

> If you want to keep this pretty much as-is I think you should add a \n
> before the (not seen in your context) "old_dir" comment seen in the
> context here above, to make it visually clear that your new comment is
> referring to these chains of returns. That could also be made clearer
> with (again, on top of master, and could be combined with the above):
>
>         diff --git a/merge-ort.c b/merge-ort.c
>         index b5015b9afd4..a418f81a3eb 100644
>         --- a/merge-ort.c
>         +++ b/merge-ort.c
>         @@ -2278,8 +2278,6 @@ static char *check_for_directory_rename(struct merge_options *opt,
>                 rename_info = check_dir_renamed(path, dir_renames);
>                 if (!rename_info)
>                         return new_path;
>         -       /* old_dir = rename_info->key; */
>         -       new_dir = rename_info->value;
>
>                 /*
>                  * This next part is a little weird.  We do not want to do an
>         @@ -2305,6 +2303,7 @@ static char *check_for_directory_rename(struct merge_options *opt,
>                  * As it turns out, this also prevents N-way transient rename
>                  * confusion; See testcases 9c and 9d of t6043.
>                  */
>         +       new_dir = rename_info->value; /* old_dir = rename_info->key; */
>                 otherinfo = strmap_get_entry(dir_rename_exclusions, new_dir);
>                 if (otherinfo) {
>                         path_msg(opt, rename_info->key, 1,

Sure, seems fine.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux