Derrick Stolee <derrickstolee@xxxxxxxxxx> 于2022年9月28日周三 00:36写道: > > > +Some of these users also arrive at this usecase from wanting to use > > +partial clones together with sparse checkouts and do disconnected > > +development. Not only do these users generally not care about other > > +parts of the repository, but consider it a blocker for Git commands to > > +try to operate on those. If commands attempt to access paths in history > > +outside the sparsity specification, then the partial clone will attempt > > +to download additional blobs on demand, fail, and then fail the user's > > +command. (This may be unavoidable in some cases, e.g. when `git merge` > > +has non-trivial changes to reconcile outside the sparsity path, but we > > +should limit how often users are forced to connect to the network.) > > This idea pairs well with a feature I've been meaning to build: > 'git sparse-checkout backfill' would download all historical blobs > within the sparse-checkout definition. This is possible with rev-list, > but I want to investigate grouping blobs by path and making requests in > batches, hopefully allowing better deltification and ability to recover > from network disconnections. That makes this idea of "staying within > your sparse-checkout means no missing object downloads" even more likely. > I think this is very useful: if I use sparse-checkout + partial-clone, plugins like git blame in vscode (or other IDE) will be invalidated, or require a lot of network overhead to download the missing blobs, so this git sparse-checkout backfill looks like a promising solution to that problem. > > +People might also end up wanting behavior B due to complex inter-project > > +dependencies. The initial attempts to use sparse-checkouts usually > > +involve the directories you are directly interested in plus what those > > +directories depend upon within your repository. But there's a monkey > > +wrench here: if you have integration tests, they invert the hierarchy: > > +to run integration tests, you need not only what you are interested in > > +and its dependencies, you also need everything that depends upon what > > +you are interested in or that depends upon one of your > > +dependencies...AND you need all the dependencies of that expanded group. > > +That can easily change your sparse-checkout into a nearly dense one. > > In my experience, the downstream dependencies are checked via builds in > the cloud, though that doesn't help if they are source dependencies and > you make a breaking change to an API interface. This kind of problem is > absolutely one of system architecture and I don't know what Git can do > other than to acknowledge it and recommend good patterns. > > In a properly-organized project, 95% of engineers in the project can have > a small sparse-checkout, then 5% work on the common core that has these > downstream dependencies and require a large sparse-checkout definition. > There's nothing Git can do to help those engineers that do cross-tree > work. > This feels like it's because your project code is stable enough, but at other companies I think many of the project dependencies are subject to frequent changes. > > + * `git mv` has similar surprises when moving into or out of the cone, so > > + best to restrict and throw warnings if restriction might affect the result. > > + > > + There may be a difference in here between behavior A and behavior B. > > + For behavior A, we probably only want to warn if there were no > > + suitable matches for files in the sparsity specification, whereas > > + for behavior B, we may want to warn even if there are valid files to > > + operate on if the result would have been different under > > + `--no-restrict`. > > I think in behavior B, users who actually want to modify things tree-wide will > actually increase their sparse-checkout definition to include those files so > they can validate what they are doing. > Agree. > > +=== Implementation Questions === > > + > > + * Does the name --[no-]restrict sound good to others? Are there better options? > > + * Names in use, or appearing in patches, or previously suggested: > > + * --sparse/--dense > > + * --ignore-skip-worktree-bits > > + * --ignore-skip-worktree-entries > > + * --ignore-sparsity > > + * --[no-]restrict-to-sparse-paths > > + * --full-tree/--sparse-tree > > + * --[no-]restrict > > I like the simplicity of --[no-]restrict, and my only worry is that it > doesn't immediately link to what it is restricting. > > Perhaps something like "scope" would describe the set of things we care > about, but use a text mode: > > --scope=sparse (--restrict) > --scope=all (--no-restrict) > > But I'm notoriously bad at naming things. > > > + * Should --[no-]restrict be a git global option, or added as options to each > > + relevant command? (Does that make sense given the multitude of different > > + default behaviors we have for different options?) > > If we can make it a global option, that would be great, then update > the commands to behave under that mode as we go. > > If that doesn't work, then adding the consistent option across commands > would be helpful. It might be good to make a OPT_RESTRICT macro (much > like OPT__VERBOSE, OPT__QUIET, and similar macros. > > > + * Should --sparse in ls-files be made an alias for --restrict? > > + `--restrict` is certainly a near synonym in cone-mode, but even then > > + it's not quite the same. In non-cone mode, ls-files' `--sparse` > > + option has no effect, and in cone-mode it still shows the sparse > > + directory entries which are technically outside the sparsity > > + specification. > > We should definitely replace the --sparse option(s) with whatever we > choose here. For ls-files, we have the issue that we are reporting > what is in the index, and in non-cone-mode the index cannot be sparse. > > Now, maybe we change what the ls-files mode does under --restrict and > only have it report the paths within the sparse-checkout and not even > show the results for sparse directory entries. The --no-restrict would > then expand a sparse-index to show only paths again. > > > + Namely, if folks are not already in a sparse checkout, then require > > + `sparse-checkout init/set` to take a `--[no-]restrict` flag (which > > + would set core.restrictToSparse according to the setting given), and > > + throw an error if the flag is not provided? That error would be a > > + great place to warn folks that the default may change in the future, > > + and get them used to specifying what they want so that the eventual > > + default switch is seamless for them. > > I don't like using the same option name (--[no-]restrict) for something > that sets a config option to keep that behavior permanently. Different > names that make it clearer could be: > > --enable-restrict-mode > --set-scope=(sparse|all) > The name sounds clear enough. I had a idea that add some configuration like: scope.<cmd>.mode=sparse|all and then let scalar help users set some default configs... > > + * clone: should we provide some mechanism for tying partial clones and > > + sparse checkouts together better. Maybe an option > > + --sparse=dir1,dir2,...,dirN > > + which: > > + * Does initial fetch with `--filter=blob:none` > > + * Does the `sparse-checkout set --cone dir1 dir2 ... dirN` thing > > + * Runs a `git rev-list --objects --all -- dir1 dir2 ... dirN` to > > + fault in the missing blobs within the sparse > > + specification...except that rev-list needs some kind of options > > + to also get files from leading directories too. > > + * Sets --restrict mode to allow focusing on the cone of interest > > + (and to permit disconnected development) > > As mentioned, I think we should have the option to backfill the blobs in > the sparse-checkout definition, but 'git clone' should not do this by > default. It's something that can be launched in the background, maybe, but > not a blocking operation on being able to use the repository. > > 'scalar clone' is an excellent testing bed for these kinds of things, > like setting the --restrict mode by default. > This sounds interesting and would like to see scalar support them! > Hopefully my responses aren't too far off-base. I'll go read the rest of > the discussion now that I've contributed my thoughts on the doc. > > Thanks, > -Stolee Thanks, -- ZheNing Hu