Hi Junio, On Fri, 21 Jan 2022, Junio C Hamano wrote: > "Johannes Schindelin via GitGitGadget" <gitgitgadget@xxxxxxxxx> > writes: > > > From: Johannes Schindelin <johannes.schindelin@xxxxxx> > > > > In 6e773527b6b (sparse-index: convert from full to sparse, 2021-03-30), > > we introduced initial support for a sparse index, and were careful to > > avoid converting to a sparse index in the presence of a split index. > > > > However, when we _just_ read a freshly-initialized index, it might not > > contain a split index even if _writing_ it will add one by virtue of > > being asked for via the `GIT_TEST_SPLIT_INDEX` variable. > > > > We did not notice any problems with checking _only_ for `split_index` > > (and not `GIT_TEST_SPLIT_INDEX`) right until both > > `vd/sparse-sparsity-fix-on-read` _and_ `vd/sparse-reset` were merged. > > > > Those two topics' interplay triggers a bug in conjunction with running > > t1091.15 when `GIT_TEST_SPLIT_INDEX=true` in the following way: > > `vd/sparse-sparsity-fix-on-read` ensures that the index is made sparse > > right after reading, and `vd/sparse-reset` ensures that the index is > > made non-sparse again unless running in the `--soft` mode. Since the > > split index feature is incompatible with the sparse index feature, we > > see a symptom like this: > > > > fatal: position for replacement 4 exceeds base index size 4 > > > > Let's fix this by avoiding the conversion to a sparse index when > > `GIT_TEST_SPLIT_INDEX=true`. > > Does [2/3] allow you to sidestep that issue entirely by skipping > 1091 altogether? You are right that reverting this patch after applying the next patch _still_ works around the issue. That's because currently only t1091 exposes the bug where `GIT_TEST_SPLIT_INDEX` writes a split index that was initially non-split (because it was just created). My intention here was to avoid any problems in case _another_ test script triggers the same condition. After all, we do have an entire CI job that forces `GIT_TEST_SPLIT_INDEX=1`, and in microsoft/git we enable sparse index by default. The entire test suite is therefore surface for the bug to show. > There are 4 hits of "if (istate->split_index" in the codebase, and > this patch makes me wonder why it is suffice to patch only one of > them. I tried to catch only those locations where we run the danger of trying to make a split index also sparse, or a sparse index also split. > I also wondered why we test both split_index and environment > separately, instead of splitting the index very early when the > environment variable is set, so that the rest of the runtime does > not have to worry about the environment, but is the reason why such > an approach was not taken was because GIT_TEST_SPLIT_INDEX can later > allow the index to be splitted, even if istate->split_index is still > NULL right now when this function is called? Indeed. A freshly created index will never be split, even if `GIT_TEST_SPLIT_INDEX` is set. Only when we _write_ this to disk will it be turned into a split index. At this point, we might have mistakenly converted the index into a sparse one, though, and this here patch wants to prevent that from happening. > If that is the reason, it leads to another question. Even if we > ignore GIT_TEST_SPLIT_INDEX and concentrate only on real workload, > if the in-core index can be NULL when this function is called and > then later can become split, there still must be somebody who > notices that sparse index is unallowed when (or after) such a split > happens, no? If there is no such code, then this patch would not be > the whole fix and there needs more change to do so, no? Indeed. Hence 3/3 which tries specifically to prevent the user from turning an already-sparse index into a split index. You will note that 3/3 calls die() instead BUG() in such a case because it would constitute a pilot error, not a bug in Git's code. Ciao, Dscho