"Victoria Dye via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Victoria Dye <vdye@xxxxxxxxxx> > > Use the index.sparse config setting to expand or collapse the index when > read. Previously, index.sparse would determine how the index would be > written to disk, but would not enforce whether the index is read into memory > as full or sparse. Now, the index is expanded when a sparse index is read > with `index.sparse=false` and is collapsed to sparse when a full index is > read with `index.sparse=true` (and the command does not require a full > index). Instead of calling both in-core index and on-disk index, perhaps use "the in-core index" as appropriately in the above description and the result would become much less ambigous. My knee-jerk reaction was that it is of dubious value to spend cycles to make the in-core index sparse after reading a non-sparse index from the disk to give to the caller, but this hurts only the commands that are not yet sparse-aware and call ensure_full_index() as the first thing they do. To them, we are wasting cycles to shrink and expand for no good reason, and after they are done, the final writeout would create a sparse on-disk index. Besides, the on-disk index is expected to be sparse most of the time when index.sparse is true, so it is hopefully a one-time waste that corrects by itself. For all commands that are sparse-aware, especially when asked to perform their operation on the paths that are not hidden by a tree-like index entry, it may or may not be a win, but the downside would be much smaller. The cost to shrink a full in-core index before writing out as a sparse one should be comparable to the cost to shrink a full in-core index just read from the disk before the sparse-index-aware caller works on it and leaves a still mostly sparse in-core index to be written out without much extra work to re-shrink it to the disk. 
> This makes the behavior of `index.sparse` more intuitive, as it now clearly
> enables/disables usage of a sparse index.

It is a minor thing, so I am willing to let it pass, but I am not
sure about this claim.  The write-out codepath ensures, independent
of this change, that a full on-disk index is corrected to become
sparse when the configuration is set to true, and a sparse on-disk
index is corrected to become full when the configuration is set to
false, no?

So the only "intuitive"-ness we may be gaining is that the codepaths
that are sparse-aware would work in their "sparse" (non-"sparse")
mode when index.sparse is set to true (false), respectively, no
matter how sparse (or not sparse) the on-disk index they work on is
initially.  That might help debuggability (assuming that conversion
between the full and sparse forms is working correctly), but I am
not sure if that is something end users would even care about.

> -	if (istate->repo->settings.command_requires_full_index)
> +	if (!istate->repo->settings.sparse_index ||
> +	    istate->repo->settings.command_requires_full_index)
>  		ensure_full_index(istate);
> +	else if (!istate->sparse_index)
> +		convert_to_sparse(istate, 0);
>
>  	return istate->cache_nr;

Quite straightforward.  Looking good.
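
For illustration only, the decision made by the hunk above can be
modelled with a small self-contained toy.  This is a sketch, not
Git's code: `toy_index`, `toy_settings` and the `toy_*` helpers are
hypothetical stand-ins that keep only the fields the hunk consults,
and the two counters exist purely to make the "shrink, then expand"
cost discussed earlier visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real structures carry far more state. */
enum toy_form { TOY_FULL, TOY_SPARSE };

struct toy_settings {
	bool sparse_index;                /* models index.sparse */
	bool command_requires_full_index; /* models the per-command flag */
};

struct toy_index {
	enum toy_form form; /* form of the in-core index */
	int expansions;     /* how many times we expanded to full */
	int collapses;      /* how many times we collapsed to sparse */
};

/* Toy counterpart of ensure_full_index(): no-op if already full. */
static void toy_ensure_full(struct toy_index *istate)
{
	if (istate->form == TOY_SPARSE) {
		istate->form = TOY_FULL;
		istate->expansions++;
	}
}

/* Toy counterpart of convert_to_sparse(): no-op if already sparse. */
static void toy_convert_to_sparse(struct toy_index *istate)
{
	if (istate->form == TOY_FULL) {
		istate->form = TOY_SPARSE;
		istate->collapses++;
	}
}

/* Same branch structure as the quoted hunk, applied right after a read. */
static void toy_adjust_after_read(struct toy_index *istate,
				  const struct toy_settings *s)
{
	assert(istate && s);
	if (!s->sparse_index || s->command_requires_full_index)
		toy_ensure_full(istate);
	else if (istate->form != TOY_SPARSE)
		toy_convert_to_sparse(istate);
}
```

Under these assumptions, reading a full index with `sparse_index`
set and then calling `toy_ensure_full()` right away (what a command
that is not yet sparse-aware would do) records one collapse followed
by one expansion, which is the wasted round trip described above.
Reading an index that is already sparse with the same setting
records neither.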