Re: [PATCH] builtin/worktree: support relative paths for worktrees

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 20, 2024 at 9:02 PM Francesco Guastella via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>     builtin/worktree: support relative paths for worktrees
>
>     As of now, when creating or managing worktrees, paths are always stored
>     in absolute form. While this behavior works well in many scenarios, it
>     presents significant challenges when worktrees are accessed across
>     different operating systems or when the repository is moved to a
>     different location in the filesystem. Absolute paths hard-coded in the
>     .git and gitdir files can break these links, causing issues that may
>     require manual intervention to resolve.
>
>     To address this, I have introduced a new configuration option:
>     worktree.useRelativePaths. This option allows users to specify whether
>     they prefer Git to store worktree paths in relative form rather than
>     absolute. The new feature enhances Git’s flexibility, particularly in
>     environments where repositories need to be portable across different
>     systems or where directories are frequently relocated.
>
>     Key Changes: The new worktree.useRelativePaths option can be enabled by
>     the user to store paths in relative form. When enabled, any new
>     worktrees added using the git worktree add command will have their paths
>     stored as relative paths in the necessary git files.
>
>     The git worktree move command has been updated to respect the current
>     value of worktree.useRelativePaths. When a worktree is moved, Git will
>     now automatically adjust the path format (relative or absolute) to match
>     the user's configuration setting.
>
>     The git worktree repair command has been similarly enhanced. It will now
>     automatically convert paths between relative and absolute forms based on
>     the worktree.useRelativePaths setting, making it easier to maintain
>     consistent links across different environments.

Using relative rather than absolute paths has been discussed
previously[1], and the general feeling was that relative paths would
probably be beneficial (despite some obvious downsides mentioned by
Junio in his response[2]). That earlier discussion petered out without
any changes being made partly because some stated and some implied
questions lacked answers.

A significant concern was whether any such change could be made
without breaking existing third-party tooling, non-canonical Git
implementations, and even older versions of Git itself. Your choice of
controlling the behavior via a configuration option somewhat sidesteps
the issue by giving the user the choice of tolerating or not
tolerating such potential breakage.

Sidestepping like that may be the best we can hope for, though it
would also be nice if the choice could be automated and hidden from
the user as an implementation detail. One idea which was floated was
for Git to store both the absolute and relative paths, thus leveraging
the benefits of both path types. But that, too, has the potential
downside of breaking existing tooling and other Git implementations,
so it requires careful consideration. At any rate, it would be good to
hear your thoughts on the idea since it might somehow fit into the
overall scheme laid out by your work, or your work may gain additional
direction by taking that idea into consideration.

Anyhow, below is a set of questions and observations that I jotted
down as I lightly scanned the patch and played around with "git
worktree" after applying the patch. Many of the items may seem to
sound negative, but do not interpret that as rejection of the idea of
using relative paths; as stated above earlier discussion looked upon
relative paths as potentially beneficial.

It's possible that the actual code changes in the patch answer some of
the questions I ask below, but I simply don't have the mental
bandwidth to digest and reason about a 3,500 line patch in order to
figure out the answers myself. As such, any answers and responses you
can provide directly will be appreciated.

Here are my notes...

Since it is likely that third-party tools and non-canonical Git
implementations (and even older versions of Git itself) will break
with relative paths, people have created tools which let them switch
between the two on-demand. This way, they can benefit from relative
paths but still interoperate (albeit in a painfully manual fashion)
with those tools or implementations which would otherwise break. Does
this implementation provide such a tool? At first glance, based upon
your above commentary, it sounded as if "git worktree repair" would
serve a similar purpose after a user manually changes the
worktree.useRelativePaths setting, but that doesn't seem to be the
case in practice.

That leads to the next question: Should there be a dedicated
git-worktree subcommand which both changes the
worktree.useRelativePaths setting and converts all the stored paths to
reflect the new setting?

Your above commentary says that "git worktree repair" converts between
absolute and relative, but it's not clear whether it does this for
*all* worktrees or only worktrees which it fixes. Moreover, does it
convert both <repo>/worktrees/<id>/gitdir and <worktree>/.git in
unison or does it only convert the path in a file it actually repairs?
Based upon testing, I'm guessing it only does the absolute/relative
conversion *if* it actually repairs a file, and *only* converts the
repaired file. As such, the links between <repo> and <worktree> can
end up a mix of absolute and relative.

According to your above commentary, "git worktree repair" converts
between absolute and relative but the documentation for "repair" in
Documentation/git-worktree.txt has not been updated to reflect this
(unlike the documentation for "git worktree move" which now does
conversion and is documented as doing so).

The new worktree.useRelativePaths configuration should be documented
in Documentation/config/worktree.txt.

The "worktrees/<id>/gitdir::" entry in
Documentation/gitrepository-layout.txt says that the contained path is
absolute. This needs to be updated to mention relative paths and
worktree.useRelativePaths.

The documentation should discuss and stress the potential dangers of
using relative paths so that people know what they are getting
themselves into. In particular, it should mention the possibility of
relative paths breaking third-party tooling, non-canonical Git
implementations, and older versions of Git itself.

Sometimes patch authors don't bother documenting newly-added "public"
API functions (possibly because a lot of existing API functions lack
documentation), so it was nice to see this patch being thorough about
documenting new API functions.

In my testing, several "git worktree" subcommands failed or misbehaved
badly when invoked from within a linked worktree. This surprised me
since my light scan over the patch seemed to indicate that you had
taken such cases into consideration, so perhaps I'm doing something
wrong(?). For my tests, I had set up a repository and worktrees like
this:

  % git init a
  % cd a
  % git config set worktree.useRelativePaths true
  % echo content >file
  % git add file
  % git commit file -m msg
  % git worktree add ../b
  % git worktree add c

"git worktree" subcommands are supposed to be agnostic in the sense
that you should be able to invoke them from within any worktree
successfully, however, with your patch applied, this seems to no
longer be the case. For instance:

  % cd c
  % git worktree lock --reason because ../../b
  fatal: '../../b' is not a working tree

Similarly, when run from within a linked worktree, "git worktree
prune" now thinks that *all* worktrees should be pruned:

  # still in "c"
  % git worktree prune -n -v
  Removing worktrees/c: gitdir file points to non-existent location
  Removing worktrees/b: gitdir file points to non-existent location

When run from within a linked worktree, "git worktree list" shows
paths relative to the main worktree (or bare repository):

  # still in "c"
  % git worktree list
  /.../a  935238d [main]
  ../b  935238d [b] prunable
  c  935238d [c] prunable

This makes the direct output useless for any sort of navigation and
especially so in --porcelain mode for scripting purposes. To fix this,
"list" should either show the absolute paths (even if they are
maintained as relative internally) or the paths need to be recomputed
and shown as relative from the *current* worktree, not from the main
worktree.

Assuming I'm not doing something wrong in my testing, then do the
above breakages indicate some gaping holes in the test suite? If so,
then we probably need a bunch of new tests to ensure that the above
behaviors don't break when relative paths are in use.

You clearly put a lot of work and care into this patch, and (as noted
above) the idea of using relative paths has previously been considered
in a reasonably positive light, so it is unlikely that reviewers want
to lightly dismiss your patch, but at 3,500 lines, the patch is
unreviewable. No reviewer is going to have the mental bandwidth to be
able to remember or reason about *every* change made by this
monolithic patch. As the author of the patch, having developed it
incrementally over the course of days or weeks, you have a good
overall understanding of all the changes in the patch and the
evolution of the code from its present state in the project to the
state in your fork, but reviewer time is a limited resource on this
project, and it is almost certainly impossible for any reviewer to be
able to properly digest this all in a single patch.

As such, aside from answering the above questions, the way to move
forward is to split these changes out into many small, independent,
well-isolated pieces, each in its own patch, and to arrange the order
of patches so that they hand-hold and lead reviewers through the
evolution from the current implementation to the end-state. Each patch
should be one step in the journey toward the end-goal, and each patch
should build upon the previous patch, requiring reviewers to remember
only one or two important items from the current patch in order to
understand the subsequent patch. For a topic with so many changes,
this will likely require quite a few patches; it may even make sense
to split it up into several series of patches, with each series
submitted separately after reviewers have had time to digest the
preceding series.

Organizing a patch series like this places a lot of extra work on your
shoulders (beyond the work you already invested making the changes to
the code itself), but providing such hand-holding and baby-steps is
the only way reviewers will be able to grasp all the changes. A
well-organized patch series is also important for future readers of
the project history when they need to understand why the current
implementation is the way it is.

[1]: The discussion begins at the "Also Eric" paragraph of this email
and continues in emails following it:
https://lore.kernel.org/git/CACsJy8CXEKG+WNdSPOWF7JDzPXidSRWZZ5zkdMW3N3Dg8SGW_Q@xxxxxxxxxxxxxx/

[2]: https://lore.kernel.org/git/xmqqikupbxh5.fsf@gitster.g/





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux