On Tue, Dec 28, 2021 at 1:32 PM Derrick Stolee via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote: > > This series is based on a merge of en/sparse-checkout-set, > en/worktree-chatty-to-stderr, and ds/sparse-checkout-malformed-pattern-fix. I think you mean es/worktree-chatty-to-stderr (not 'en/') > This patch series includes a fix to the bug reported by Sean Allred [1] and > diagnosed by Eric Sunshine [2]. > > The root cause is that 'git sparse-checkout init' writes to the worktree > config without checking that core.bare might need to be set. This only > matters when the base repository is bare, since creating the config.worktree > file and enabling extensions.worktreeConfig will cause Git to treat the base > repo's core.bare=false as important for this worktree. > > This series fixes this, but also puts in place some helpers to prevent this > from happening in the future. While here, some of the config paths are > modified to take a repository struct. > > * 'git sparse-checkout' will now modify the worktree config, if enabled. It > will no longer auto-upgrade users into using worktree config. This sounds dangerous to me. > * The new 'git worktree init-worktree-config' will upgrade users to using > worktree config. It will relocate core.bare and core.worktree if > necessary. This sounds like giving users an extra step to unbreak themselves, instead of just having the commands do it. (And might risk also breaking things in a different direction? I'll have to read over the patches to see...) > * 'git worktree add' will copy the sparse-checkout patterns from the > current worktree to the new one. If worktree config is enabled, then the > config settings from the current worktree are copied to the new > worktree's config file. This sounds awesome. Wahoo! (core.worktree should be cleared, though, and core.bare should either be cleared or set to false if found in the worktree config.) ... > Range-diff vs v2: > > 1: 889e69dc45d = 1: 749ba67d21e setup: use a repository when upgrading format > 2: 3e01356815a = 2: 61b96937016 config: make some helpers repo-aware > 3: ed8e2a7b19d ! 3: e2a0a458115 worktree: add upgrade_to_worktree_config() ... > @@ Commit message > Helped-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx> > Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> > > - ## worktree.c ## > + ## Documentation/git-worktree.txt ## > +@@ Documentation/git-worktree.txt: SYNOPSIS > + -------- > + [verse] > + 'git worktree add' [-f] [--detach] [--checkout] [--lock [--reason <string>]] [-b <new-branch>] <path> [<commit-ish>] > ++'git worktree init-worktree-config' > + 'git worktree list' [-v | --porcelain] > + 'git worktree lock' [--reason <string>] <worktree> > + 'git worktree move' <worktree> <new-path> > +@@ Documentation/git-worktree.txt: checked out in the new working tree, if it's not checked out anywhere > + else, otherwise the command will refuse to create the working tree (unless > + `--force` is used). > + > ++init-worktree-config:: > ++ > ++Initialize config settings to enable worktree-specific config settings. > ++This will set `core.repositoryFormatversion=1` and enable > ++`extensions.worktreeConfig`, which might cause some third-party tools from s/cause/prevent/ ? > ++being able to operate on your repository. See CONFIGURATION FILE for more > ++details. So, if users attempt to use `git worktree add` or `git sparse-checkout {init,set}` without first running this, they can break other worktrees. And if they do run this new command, they potentially break third-party tools or older git versions. Yet, with the very common case (in fact, I'd go so far as to say the nearly universal case) of having both core.bare=false and no core.worktree set, we never had any such problems with the former logic. This seems like a serious regression to me. I'll keep reading. ... > +@@ Documentation/git-worktree.txt: versions will refuse to access repositories with this extension. > + > + Note that in this file, the exception for `core.bare` and `core.worktree` > + is gone. If they exist in `$GIT_DIR/config`, you must move > +-them to the `config.worktree` of the main working tree. You may also > +-take this opportunity to review and move other configuration that you > +-do not want to share to all working trees: > ++them to the `config.worktree` of the main working tree. These keys are > ++moved automatically when you use the `git worktree init-worktree-config` > ++command. > ++ > ++You may also take this opportunity to review and move other configuration > ++that you do not want to share to all working trees: > + > + - `core.worktree` and `core.bare` should never be shared I'm fine with the wording, but technically core.bare=false is fine to share, it's just core.bare=true that is not. (And yes, I agree that core.worktree is never safe to share) ... > 5: 06457fafa78 ! 5: b200819c1bb sparse-checkout: use repo_config_set_worktree_gently() > @@ Commit message > sparse-checkout: use repo_config_set_worktree_gently() > ... > + Let the sparse-checkout builtin use this helper instead of attempting to > + initialize the worktree config on its own. This changes behavior of 'git > + sparse-checkout set' in a few important ways: > > - The fix is to have this assignment into config.worktree be handled by > - the repo_config_set_worktree_gently() helper. > + 1. Git will no longer upgrade the repository format and add the > + worktree config extension. The user should run 'git worktree > + init-worktree-config' to enable this feature. > + > + 2. If worktree config is disabled, then this command will set the > + core.sparseCheckout (and possibly core.sparseCheckoutCone and > + index.sparse) values in the common config file. Yikes. > + 3. If the main worktree is bare, then this command will not put the > + worktree in a broken state. > + > + The main reason to use worktree-specific config for the sparse-checkout > + builtin was to avoid enabling sparse-checkout patterns in one and > + causing a loss of files in another. If a worktree does not have a > + sparse-checkout patterns file, then the sparse-checkout logic will not > + kick in on that worktree. > + > + This new logic introduces a new user pattern that could lead to some > + confusion. Suppose a user has not upgraded to worktree config and > + follows these steps in order: > + > + 1. Enable sparse-checkout in a worktree. > + > + 2. Disable sparse-checkout in that worktree without deleting that > + worktree's sparse-checkout file. > + > + 3. Enable sparse-checkout in another worktree. > + > + After these steps, the first worktree will have sparse-checkout enabled > + with whatever patterns exist. The worktree does not immediately have > + those patterns applied, but a variety of Git commands would apply the > + sparse-checkout patterns and update the worktree state to reflect those > + patterns. This situation is likely very rare and the workaround is to No, it's not even rare, let alone very rare. I'd actually call it common. Since 'sparse-checkout disable' does not delete the sparse-checkout file, and we've encouraged folks to use the sparse-checkout command (or a wrapper thereof) instead of direct editing of the sparse-checkout file (and indeed, the sparse-checkout command will overwrite the sparse-checkout file which further discourages users from feeling they own it), having the file left around after disabling is the common case. So, the only question is, how often do users disable and re-enable sparse-checkout, and potentially only do so in some of their worktrees? At my $DAYJOB, that's actually quite common. I got multiple reports quite soon after introducing our `sparsify` tool about users doing something like this; this is what led me to learn of the extensions.worktreeConfig, and why I pointed it out to you on your first submission of git-sparse-checkout[1] (https://lore.kernel.org/git/CABPp-BFcH5hQqujjmc88L3qGx3QAYZ_chH6PXQXyp13ipfV6hQ@xxxxxxxxxxxxxx/) Some more details about sparse-checkouts at $DAYJOB: * We have numerous users who used to have their own small little repos, and want to still have that experience. sparse-checkouts were a way to provide a small-repo feel for them, and in particular to speed up IDEs that otherwise insist on indexing everything no matter how much we try to tell them to only pay attention to part of the files (and similar speed issues with the build system since gradle configuration is so slow). They want every worktree to be sparse and never face the full repo. I talk about these users the most, because we've mostly satisfied the other users. * We have numerous users who work mostly on specific modules, but occasionally do cross-cutting work. They will have most worktrees be sparse, but keep at least one that is full. They may also undo sparsity briefly and redo it in various worktrees. * We also have a few users who work primarily on cross-cutting features and just keep the full checkout and ignore the sparse-checkout stuff. * There is also a special tool, write-locks.sh or some name like that, which must unsparsify, run something like `./gradle --write-locks` and other stuff, and then re-sparsify. The tool does not work without the full worktree. Now, not all users need to run this, and in fact most users who do probably are the same ones that "occasionally do cross-cutting work", but it's a little bit wider group than that. So we have additional unsparsify/resparsify steps, and it's further complicated by the fact that users may not even know that we do the unsparsification and resparsification behind the scenes for them (our `sparsify` wrapper has a stash-like feature, though I think only allowing for one on the stash, which is called behind the scenes.) The write-locks logic is super slow, which kind of hides the otherwise slow unsparsify/resparsify steps. (For each of the above type of users, we have many folks who don't use worktrees, but they're not relevant to this discussion.) We also have third party git tools, whether jgit (usually via gradle plugins), whatever IDEs tend to use (eclipse used egit, not sure what intellij or visual studio use), probably various special one-off scripts that read a bit more git configuration than they should and do various tasks, and a wide range of git versions in use (though any given user will likely only use one git version, and we are perfectly happy to specify a minimum version for sparse-checkout usage). So that's the range of users for this discussion, but I think we also need to flesh out the caveats of your change since you missed a few: * Even if users haven't sparsified/unsparsified/re-sparsified in another worktree (i.e. they have a worktree that has always been full), the sparse-checkout init/set in another worktree _still_ causes problems because when users switch to the 'full' worktree, contrib/completion/git-prompt.sh will report it as sparse. Users are going to get confused and report bugs just based on their prompt even if it's technically a "dense" worktree. * The cone-mode and sparse-index are also shared between worktrees, which may not present many problems for me right now at $DAYJOB, but could be problematic for other users out there. So, here's the experience I expect from these patches at $DAYJOB: (1) Several users per week hit the case of one worktree being sparsified when it wasn't supposed to be. (2) These users have no idea how to figure out what they need to do to fix it. The init-worktree-config is no more discoverable than the documentation on the official steps for enabling extensions.worktreeConfig (See https://lore.kernel.org/git/CABPp-BGKyDJV9DP+igmCC_Ad0jgvb4aOAYpXWCbx9hW8ShhDQg@xxxxxxxxxxxxxx/ up through the paragraph, "Further, it's not even clear people would look at git-worktree.txt.) (3) Even if they do discover it, and run it, it's an extra step they never needed to take before. Why are we adding a "unbreak these other commands we want to run" step? (4) Also, even if they do discover it, and run it, suddenly we are setting core.repositoryFormatVersion=1. That scares me. I have years of experience at $DAYJOB saying that the tooling we have works fine with extensions.worktreeConfig=true. I have none with setting core.repositoryFormatVersion=1, but now we're effectively requiring it by your documentation. I have no idea how jgit/egit/other-random-stuff interacts with that. I'd be willing to do some tests with targetted users to try to learn more, but suddenly turning it on for everyone in cases that we know worked fine without it previously feels unsafe to me. Maybe I'm over-worrying here, but see also commit 11664196ac ("Revert "check_repository_format_gently(): refuse extensions for old repositories"", 2020-07-15) -- it just feels a bit late to recommend for users, especially when they'll see it as "oh, if you don't want this other bug we recently introduced you need to run this....". So, I'd like to reiterate my earlier suggestion which would avoid these regressions while also fixing the reported bug: * If core.bare=true or core.worktree is set, then at `git worktree add` time, automatically run the logic you have here for init-worktree-config. Having either of those config settings with multiple worktrees is currently broken in all git versions and likely in most all external tools. As such, being aggressive in the new config settings to allow new versions of git to work seems totally safe to me -- we can't be any more broken than we already were. * If core.bare=false and core.worktree is not set, then: * `git sparse-checkout {init,set}` should set extensions.worktreeConfig if not already set, and always set the core.sparse* and index.sparse settings in worktree-specific files. * `git worktree add`, if extensions.worktreeConfig is already set, will copy both the info/sparse-checkout file and the config.worktree settings (module core.bare and core.worktree, if present) to the new worktree