On 10/27/2021 5:32 PM, Junio C Hamano wrote:
> Derrick Stolee <stolee@xxxxxxxxx> writes:
>
>>> +int convert_to_sparse(struct index_state *istate, int flags)
>>> +{
>>> +	/*
>>> +	 * If the index is already sparse, empty, or otherwise
>>> +	 * cannot be converted to sparse, do not convert.
>>> +	 */
>>> +	if (istate->sparse_index || !istate->cache_nr ||
>>> +	    !is_sparse_index_allowed(istate, flags))
>>> +		return 0;
>
> Shouldn't we also at least do this? Blindly blowing away the entire
> cache-tree and rebuilding it from scratch may be hiding a latent bug
> somewhere else, but is never supposed to be needed, and is a huge
> waste of computational resources.
>
> I say "at least" here, because a cache tree that is partially valid
> should be safely salvageable---at least that was the intention back
> when I designed the subsystem.

I think you are right about what you propose below. It certainly seems
like it would work, and it would even speed up the conversion from full
to sparse. I erred on the side of extreme caution, hoping that
converting to sparse would be rare.

>  sparse-index.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --git c/sparse-index.c w/sparse-index.c
> index bc3ee358c6..a95c3386f3 100644
> --- c/sparse-index.c
> +++ w/sparse-index.c
> @@ -188,17 +188,19 @@ int convert_to_sparse(struct index_state *istate, int flags)
>  	if (index_has_unmerged_entries(istate))
>  		return 0;
>
> -	/* Clear and recompute the cache-tree */
> -	cache_tree_free(&istate->cache_tree);
> -	/*
> -	 * Silently return if there is a problem with the cache tree update,
> -	 * which might just be due to a conflict state in some entry.
> -	 *
> -	 * This might create new tree objects, so be sure to use
> -	 * WRITE_TREE_MISSING_OK.
> -	 */
> -	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
> -		return 0;
> +	if (!cache_tree_fully_valid(&istate->cache_tree)) {
> +		/* Clear and recompute the cache-tree */
> +		cache_tree_free(&istate->cache_tree);
> +		/*
> +		 * Silently return if there is a problem with the cache tree update,
> +		 * which might just be due to a conflict state in some entry.
> +		 *
> +		 * This might create new tree objects, so be sure to use
> +		 * WRITE_TREE_MISSING_OK.
> +		 */
> +		if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
> +			return 0;
> +	}

I think at this point we have enough tests that check the sparse index
and its different conversion points that the test suite might catch it
if this is a bad idea.

Note that this is only a change of behavior if the cache-tree is valid,
which I expect to be the case most of the time in the tests.

Thanks,
-Stolee
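
To make the behavior change concrete, here is a toy model of the guard
in the patch above. It is only a sketch, not Git code: struct toy_tree,
toy_tree_fully_valid(), toy_tree_mark_valid(), toy_convert() and
toy_rebuild_count are all hypothetical names invented for illustration,
and "negative entry_count means invalid" is an assumption of the toy.
It shows that the expensive rebuild runs only when some node in the
tree is invalid, and is skipped once the whole tree is valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cache-tree node; NOT Git's struct cache_tree. */
struct toy_tree {
	int entry_count;	/* negative means "invalid" in this toy */
	size_t nr_children;
	struct toy_tree **children;
};

/* Counts how often the expensive full rebuild runs. */
int toy_rebuild_count;

/* Toy analogue of a "fully valid" check: every node must be valid. */
bool toy_tree_fully_valid(const struct toy_tree *t)
{
	if (!t || t->entry_count < 0)
		return false;
	for (size_t i = 0; i < t->nr_children; i++)
		if (!toy_tree_fully_valid(t->children[i]))
			return false;
	return true;
}

/* Stands in for the free-and-recompute step: revalidate every node. */
void toy_tree_mark_valid(struct toy_tree *t)
{
	if (!t)
		return;
	if (t->entry_count < 0)
		t->entry_count = 0;
	for (size_t i = 0; i < t->nr_children; i++)
		toy_tree_mark_valid(t->children[i]);
}

/* Mirrors the guarded shape of the patch: rebuild only when needed. */
void toy_convert(struct toy_tree *t)
{
	if (!toy_tree_fully_valid(t)) {
		toy_rebuild_count++;
		toy_tree_mark_valid(t);
	}
	/* ... the conversion itself would continue here ... */
}
```

Calling toy_convert() twice on a tree with one invalid leaf rebuilds
on the first call only; the second call finds the tree fully valid and
skips the rebuild, which is the case I expect to dominate in the tests.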