On Wed, May 31, 2023 at 3:00 PM Tao Klerks <tao@xxxxxxxxxx> wrote: > I've recently encountered some users who've got themselves into > trouble by copying a worktree into another folder, to create a second > worktree, instead of using "git worktree add". > > They assumed this would be fine, just the same as creating a new > *repository* by copying an new *repository* is generally fine (except > when you introduce worktrees, of course). > > There users eventually understood that something was wrong, as you > might expect, when a checkout in one worktree also "checked out" (and > produced lots of unexpected changes) in the other worktree - or > similar weirdnesses, eg on commit. > > Ideally, as I user, I would expect git to start shouting at me to "git > worktree repair" the moment the mutual reference between the ".git" > file of the worktree, and the "gitdir" file of the repo's worktree > metadata, stopped agreeing. > > Is there any reason we can't/don't have such a safety check? (would it > be expensive??) > > I think I can implement something reasonably effective with a pair of > hooks (pre-commit and post-checkout look like good candidates), but > I'm weirded out that something like this should need to be "custom". > > I did a quick search of the archives but didn't find anything > relevant. Has this been discussed? Should I try to prepare a patch > including that sort of validation... somewhere? Just some thoughts, not a proper answer... If you want to detect this sort of problem early, then patching one of the initialization functions which is called by each Git command which requires a working tree (i.e. those with NEED_WORK_TREE set or which call setup_work_tree() manually) would be one approach. It's not immediately obvious which function should be patched since there is deep and mysterious interaction between the various startup functions, but probably something in setup.c or environment.c. An alternative might be to perform the detection only when the "index" is about to be updated or some other worktree-local administrative entry, but that seems like a much less tractable approach. Aside from the corruption you describe above, there may be other types of corruption worth detecting early, but any such detection may be prohibitively expensive, even if it's just the type you mentioned. (By "prohibitively", I'm referring to previous work to reduce the startup time of Git commands; people may not want to see those gains reversed, especially for a case such as this which is likely to be rare.) Unless I'm mistaken, the described corruption can only be detected by a Git command running within the duplicate directory itself since there will be no entry in .git/worktrees/ corresponding to the manually-created copy. Hence, Git commands run in any other worktree will be unable to detect it, not even the "git worktree" command. The validation performed by worktree.c:validate_worktree() is already capable of detecting the sort of corruption you describe, if handed a `struct worktree` populated to point at the worktree which is a copy of the original worktree. However, there is no API presently to create a `struct worktree` from an arbitrary path, so that is something you'd have to add if you want to take advantage of worktree.c:validate_worktree(). Adding such a function does feel somewhat ugly and special-purpose, though. I was going to mention that suggesting to the user merely to use "git worktree repair" would not help in this situation, but I see in your followup email that you instruct the user to first delete one of the worktrees before running "git worktree repair", which would indeed work. However, you probably want the instructions to be clearer, saying that the user should delete the worktree directory manually (say, with "rm -rf"), _not_ with "git worktree remove". It wouldn't help with early-detection, but it might be worthwhile to add a "git worktree validate" command which detects and reports problems with worktree administrative data (i.e. call worktree.c:validate_worktree() for each worktree). However, the above caveat still applies about inability to detect the problem from other worktrees. Upgrading "git worktree repair" to report the problem might be worthwhile, but it also is subject to the same caveat about inability to detect the problem from other worktrees. One might suggest that "git worktree repair" should somehow repair the situation itself, perhaps by creating a new entry in .git/worktrees/, but it's not at all clear if that is a good idea. Any such automated solution would need to be very careful not to throw away the user's work (for instance, work which is staged in the "index"). It may be possible to add some option or subcommand to "git worktree" which would allow a user to "attach" an existing directory as a worktree. For instance: % cp -R worktreeA worktreeB % git worktree attach worktreeB would create the necessary administrative entries in .git/worktrees/worktreeB/ and make worktreeB/.git point at .git/worktrees/worktreeB. However, I'm extremely reluctant to suggest doing this since we don't want to encourage users to perform worktree manipulation manually; they should be using "git worktree" for all such tasks. (We also don't want users hijacking a worktree which belongs to some other repository.)