Re: [PATCH 0/2] Sparse index: fetch, pull, ls-files

On 11/17/2021 4:29 AM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
>> This is based on ld/sparse-index-blame (merged with 'master' due to an
>> unrelated build issue).
>>
>> Here are two relatively-simple patches that further the sparse index
>> integrations.
>>
>> Did you know that 'fetch' and 'pull' read the index? I didn't, or this would
>> have been an integration much earlier in the cycle. They read the index to
>> look for the .gitmodules file in case there are submodules that need to be
>> fetched. Since looking for a file by name is already protected, we only need
>> to disable 'command_requires_full_index' and we are done.
> 
> This reminds me of one thing.  Our strategy so far has been:
> 
>  - start with blindly calling ensure_full();
> 
>  - audit the code and adjust each code path that needs to walk to
> 
>    . keep walking the full index, but narrow the scope of
>      ensure_full_index() if we can; or
> 
>    . stop assuming we need to walk the full index, but teach it to
>      "skip" the tree-in-index that we are not interested in.
> 
>  - declare victory and drop the blind call to ensure_full().

This is a fair summary of the approach.
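
To make the steps above concrete, here is a small toy model in Python.
It is only a sketch and does not reproduce Git's C code: ToyIndex,
make_index, run_command and the example paths are all made up for
illustration, and it assumes that a by-name lookup only needs to expand
the index when the requested path lies inside a sparse directory entry.

```python
class ToyIndex:
    """Toy index: some paths are tracked individually, others are
    collapsed beneath a sparse directory entry."""

    def __init__(self, files, sparse_dirs):
        self.files = set(files)
        self.sparse_dirs = dict(sparse_dirs)  # "dir/" -> paths beneath it
        self.expansions = 0                   # times the index was expanded

    def ensure_full(self):
        """Expand every sparse directory entry (the expensive step)."""
        if not self.sparse_dirs:
            return
        self.expansions += 1
        for hidden in self.sparse_dirs.values():
            self.files.update(hidden)
        self.sparse_dirs.clear()

    def has_path(self, path):
        """By-name lookup; expands only if the path is inside a sparse dir."""
        if any(path.startswith(d) for d in self.sparse_dirs):
            self.ensure_full()
        return path in self.files


def make_index():
    """Build a fresh toy index with one collapsed directory (made-up paths)."""
    return ToyIndex(
        files=[".gitmodules", "README", "src/main.c"],
        sparse_dirs={"vendor/": ["vendor/a.c", "vendor/b.c"]},
    )


def run_command(index, requires_full_index):
    """Hypothetical command that only asks whether .gitmodules exists."""
    if requires_full_index:  # the blind guard from the first step
        index.ensure_full()
    return index.has_path(".gitmodules")


guarded = make_index()
integrated = make_index()
found_guarded = run_command(guarded, requires_full_index=True)
found_integrated = run_command(integrated, requires_full_index=False)
print(found_guarded, guarded.expansions)
print(found_integrated, integrated.expansions)
```

Both runs find .gitmodules, but only the guarded one pays for an
expansion. In this model that is the whole effect of dropping the
blind guard for a command whose lookups are already protected.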

> But what makes sure, after all of the above happens, that no new
> changes that assume it can walk the full index enters in the
> codebase?
> 
> In other words, after "fetch" is declared "sparse clean" with patch
> [1/2], what effort from us will it take to stay clean?

The tests in t1092 that use the "ensure_not_expanded" helper are
intended to be regression tests that would start failing if the
sparse index starts expanding in a new way. I think this is what
you mean by staying "sparse clean".
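
The idea behind such a regression check can be sketched as follows.
This is a toy in Python, not the helper from t1092: the event strings
and both example commands are hypothetical, and it simply assumes that
an expansion leaves a recognizable marker in a trace log that the check
can look for.

```python
def lookup_by_name(trace):
    """Hypothetical command that reads the index and looks up one path."""
    trace.append("region_enter:index:do_read_index")


def walk_every_entry(trace):
    """Hypothetical new code path that loops over all entries and
    therefore has to expand the sparse index first."""
    trace.append("region_enter:index:do_read_index")
    trace.append("region_enter:index:ensure_full_index")


def ensure_not_expanded(command):
    """Run the command with a fresh trace and report whether it stayed
    free of expansion markers."""
    trace = []
    command(trace)
    return not any(event.endswith(":ensure_full_index") for event in trace)


print(ensure_not_expanded(lookup_by_name))
print(ensure_not_expanded(walk_every_entry))
```

A command that was "sparse clean" and later gains a full walk would
flip from the first result to the second, which is the failure such a
test is meant to surface.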

The rest of the tests verify that the behavior is correct when the
sparse index is enabled; when something goes wrong there, it should
hopefully be more obvious. We've tried to create a robust set of
tests here, but I'm sure we will add new ones as more users start
using it. (We will soon have a large set of real users of the
sparse index, and that should be informative.)

Do you see a test gap that would be prudent to address?

One gap I can see is that new features that change how the index is
used are not automatically tested with sparse-checkout and the
sparse index. For those, we will need to pay closer attention during
review to ensure that they fit within the sparse index model (or are
sufficiently protected by ensure_full_index() in their first
version).

I spent some time thinking about how we could add a
GIT_TEST_SPARSE_INDEX mode to the test suite that would automatically
test the sparse index in the existing tests. However, the feature is
only enabled when in cone-mode sparse-checkout, so we would need to
modify the test repos in a way that some sparse directory entries
exist but also don't collide with the test expectations. I never
found a way to get around that issue, which is why t1092 is such a
big test script.

Thanks,
-Stolee


