On 8/17/2021 5:32 PM, SZEDER Gábor wrote: > On Tue, Aug 17, 2021 at 02:26:23PM -0400, Derrick Stolee wrote: >> On 8/17/2021 1:49 PM, SZEDER Gábor wrote: >>> The sparse index and split index features are said to be currently >>> incompatible [1], and consequently GIT_TEST_SPLIT_INDEX=1 might >>> interfere with the test cases exercising the sparse index feature. >>> Therefore GIT_TEST_SPLIT_INDEX is already explicitly disabled for the >>> whole of 't1092-sparse-checkout-compatibility.sh'. There are, >>> however, two other test cases exercising sparse index, namely >>> 'sparse-index enabled and disabled' in >>> 't1091-sparse-checkout-builtin.sh' and 'status succeeds with sparse >>> index' in 't7519-status-fsmonitor.sh', and these two could fail with >>> GIT_TEST_SPLIT_INDEX=1 as well [2]. >>> >>> Unset GIT_TEST_SPLIT_INDEX and disable the split index in these two >>> test cases to avoid such interference. >>> >>> Note that this is the minimal change to merely avoid failures when >>> these test cases are run with GIT_TEST_SPLIT_INDEX=1. Interestingly, >>> though, without these changes the 'git sparse-checkout init --cone >>> --sparse-index' commands still succeed even with split index, and set >>> all the necessary configuration variables and create the initial >>> '$GIT_DIR/info/sparse-checkout' file, but the test failures are caused >>> by later sanity checks finding that the index is not in fact a sparse >>> index. This indicates that 'git sparse-checkout init --sparse-index' >>> lacks some error checking and its tests lack coverage related to split >>> index, but fixing those issues (let alone making sparse index >>> comparible with split index) is beyond the scope of this patch series. >> >> s/comparible/compatible. >> >> I agree that making these two things compatible is not something to >> solve today. I'm not sure they should _ever_ be solved because of >> the complexity involved (what if the base index is not sparse but >> the tip wants to be, or vice-versa?, > > I think that this's not an issue, because a shared index file is not > for forever, but it's expected that a new shared index is written > every once in a while anyway, see splitIndex.maxPercentChange. So > whenever sparse index gets enabled/disabled we could just write a new > shared index accordingly. Maybe even when the sparse patterns change, > if necessary. > >> or if a directory must be expanded because of a conflict?). > > I'm not quite up-to-date in sparse index terminology, so I'm not sure > that this means... but as mentioned above we can just write a new > shared index when deemed necessary. > >> They use very different approaches >> to solve a similar problem: how to deal with large index files. >> >> * The split index reduces index _write_ time by only editing a diff >> of the base index. >> >> * The sparse index reduces index _read and write_ time by writing a >> smaller index, but only if the user is using cone mode sparse- >> checkout. > > Yeah, I think that in general there is more to be gained with sparse > index than with split index, though the split index might further > reduce the write time of a sparse index, because the amount of data > written is proportional with the nr of changed files instead of the nr > of all files in the sparse index. I'm not sure that it's worth it, > either. > > My remarks about compatibility primarily stem from your remarks about > compatibility in response to my sparse vs. split test failure report > in: > > https://public-inbox.org/git/48e9c3d6-407a-1843-2d91-22112410e3f8@xxxxxxxxx/ > > "the sparse-index is (currently) incompatible with the split-index" > > I assumed that that "(currently)" implies that eventually the two will > be made compatible. I say "(currently)" because there is nothing stopping someone from integrating them. I'm not sure anyone ever will. > Anyway, I'm fine with leaving them incompatible, but 'git > sparse-checkout' should still learn about split index, so it won't > pretend to succeed after it couldn't do what it was asked to when > there is a split index. Maybe vice-versa as well. You're saying that when 'git sparse-checkout init --cone --sparse-index' doesn't write a sparse index because the index is already split, we should actually die() saying "split index and sparse index are incompatible" so the user knows why the sparse index was not enabled? That seems reasonable for now, although it is successful in setting the config to enable the sparse index opportunistically in the future. >>> test_expect_success 'sparse-index enabled and disabled' ' >>> - git -C repo sparse-checkout init --cone --sparse-index && >>> - test_cmp_config -C repo true index.sparse && >>> - test-tool -C repo read-cache --table >cache && >>> - grep " tree " cache && >>> - >>> - git -C repo sparse-checkout disable && >>> - test-tool -C repo read-cache --table >cache && >>> - ! grep " tree " cache && >>> - git -C repo config --list >config && >>> - ! grep index.sparse config >>> + ( >>> + sane_unset GIT_TEST_SPLIT_INDEX && >>> + git -C repo update-index --no-split-index && >>> + >>> + git -C repo sparse-checkout init --cone --sparse-index && >>> + test_cmp_config -C repo true index.sparse && >>> + test-tool -C repo read-cache --table >cache && >>> + grep " tree " cache && >>> + >>> + git -C repo sparse-checkout disable && >>> + test-tool -C repo read-cache --table >cache && >>> + ! grep " tree " cache && >>> + git -C repo config --list >config && >>> + ! grep index.sparse config >>> + ) >>> ' >> >> This test is safe for now. >> >>> test_expect_success 'status succeeds with sparse index' ' >> >> This test is being edited in ds/sparse-index-ignored-files. v3 >> of the relevant patch was just sent today [1]. >> >> [1] https://lore.kernel.org/git/e66106f7a99d94145eec983ea5e72b7cf8a8a479.1629206603.git.gitgitgadget@xxxxxxxxx/ >> >> You might want to rebase on top of that topic. The edits to >> the test are likely stable now. > > Oh, no :) The test 'cone mode clears ignored subdirectories' added in > that patch series (as in 'seen', so not the newes version) fails with > the working GIT_TEST_SPLIT_INDEX=1 as well, and I don't immediately > see why, as it doesn't seem to use sparse index. Oh interesting! This is because the index converts to a sparse index in-memory solely so it can use the sparse directory entries to pick which directories need to be deleted. In the presence of a split index, convert_to_sparse() fails due to these lines: if (istate->split_index || istate->sparse_index || !istate->cache_nr || !core_apply_sparse_checkout || !core_sparse_checkout_cone) return 0; if (!istate->repo) istate->repo = the_repository; if (!(flags & SPARSE_INDEX_IGNORE_CONFIG)) { ... While this if here is checking the condition for creating it in-memory. The fix is likely to do the following instead: if (istate->sparse_index || !istate->cache_nr || !core_apply_sparse_checkout || !core_sparse_checkout_cone) return 0; if (!istate->repo) istate->repo = the_repository; if (!(flags & SPARSE_INDEX_IGNORE_CONFIG)) { if (istate->split_index) return 0; ... and maybe even rename the flag to SPARSE_INDEX_MEMORY_ONLY or something. Thanks for alerting me to this issue! -Stolee