Re: [PATCH v4 4/4] sparse-index: update do_read_index to ensure correct sparsity

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Elijah Newren wrote:
> On Fri, Oct 29, 2021 at 6:56 AM Victoria Dye via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>>
>> From: Victoria Dye <vdye@xxxxxxxxxx>
>>
>> If `command_requires_full_index` is false, ensure correct in-core index
>> sparsity on read by calling `ensure_correct_sparsity`. This change is meant
>> to update the how the index is read in a command after sparse index-related
> 
> s/update the how/update how/ ?
> 

This was probably the result of mixing up "update the way" and "update how".
I'll go with the latter.

>> repository settings are modified. Previously, for example, if `index.sparse`
>> were changed from `true` to `false`, the in-core index on the next command
>> would be sparse. The index would only be expanded to full when it was next
>> written to disk.
>>
>> By adding a call to `ensure_correct_sparsity`, the in-core index now matches
>> the sparsity dictated by the relevant repository settings as soon as it is
>> read into memory, rather than when it is later written to disk.
> 
> I split up reading this series across different days, and when I got
> here, my first reaction was "Okay, but why would you want that?
> Sounds like extra work for no gain."  I went and looked up the cover
> letter and saw you mentioned that this "introduces the ability to
> enable/disable the sparse index on a command-by-command basis (e.g.,
> allowing a user to troubleshoot a sparse-aware command with '-c
> index.sparse=false' [1]).  That seems like a good reason to me, and
> sounds like it belongs in this commit message.  But it sounds like you
> had other reasons in mind.  If so, could you share them; I'm having
> difficulty understanding how this would make a difference other than
> in the troubleshooting scenario.
> 

The troubleshooting scenario was what inspired this change in the first
place, but it applies any time someone changes repository settings outside
of `git sparse-checkout init`. One relevant scenario I can imagine is if a
user enables `index.sparse, commands would not use the sparse index until a
command that *writes* the index is executed:

$ git config index.sparse true
$ git diff      # read full index, full in-core, no write
$ git add .     # read full index, full in-core, write sparse
$ git diff      # read sparse index, sparse in-core, no write

Another scenario might be enabling `index.sparse`, then running a command
that turns out to be surprisingly slow because it's operating on a full
index in-core (i.e., much slower than the convert_to_sparse + command with
sparse index would be). 

These are definitely edge cases - they rely on manual config changes, and
they're only really noticeable 1) if the config change is immediately
followed by index read-only commands and 2) if the first index-writing
command after the config change is slow. That said, on the off chance that a
user (or future developer) encounter one of these scenarios, I think it'll
be helpful to have `index.sparse` take effect "immediately" after the config
change.

I'll include these (and the troubleshooting) examples in the updated commit
message.

>>
>> Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
>> Co-authored-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx>
>> ---
>>  read-cache.c                             |  8 ++++++
>>  t/t1092-sparse-checkout-compatibility.sh | 31 ++++++++++++++++++++++++
>>  2 files changed, 39 insertions(+)
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index a78b88a41bf..b3772ba70a1 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -2337,9 +2337,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
>>
>>         if (!istate->repo)
>>                 istate->repo = the_repository;
>> +
>> +       /*
>> +        * If the command explicitly requires a full index, force it
>> +        * to be full. Otherwise, correct the sparsity based on repository
>> +        * settings and other properties of the index (if necessary).
>> +        */
>>         prepare_repo_settings(istate->repo);
>>         if (istate->repo->settings.command_requires_full_index)
>>                 ensure_full_index(istate);
>> +       else
>> +               ensure_correct_sparsity(istate);
>>
>>         return istate->cache_nr;
>>
>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>> index ca91c6a67f8..59accde1fa3 100755
>> --- a/t/t1092-sparse-checkout-compatibility.sh
>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>> @@ -694,6 +694,37 @@ test_expect_success 'sparse-index is expanded and converted back' '
>>         test_region index ensure_full_index trace2.txt
>>  '
>>
>> +test_expect_success 'index.sparse disabled inline uses full index' '
>> +       init_repos &&
>> +
>> +       # When index.sparse is disabled inline with `git status`, the
>> +       # index is expanded at the beginning of the execution then never
>> +       # converted back to sparse. It is then written to disk as a full index.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index -c index.sparse=false status &&
>> +       ! test_region index convert_to_sparse trace2.txt &&
>> +       test_region index ensure_full_index trace2.txt &&
>> +
>> +       # Since index.sparse is set to true at a repo level, the index
>> +       # is converted from full to sparse when read, then never expanded
>> +       # over the course of `git status`. It is written to disk as a sparse
>> +       # index.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index status &&
>> +       test_region index convert_to_sparse trace2.txt &&
>> +       ! test_region index ensure_full_index trace2.txt &&
>> +
>> +       # Now that the index has been written to disk as sparse, it is not
>> +       # converted to sparse (or expanded to full) when read by `git status`.
>> +       rm -f trace2.txt &&
>> +       GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
>> +               git -C sparse-index status &&
>> +       ! test_region index convert_to_sparse trace2.txt &&
>> +       ! test_region index ensure_full_index trace2.txt
>> +'
>> +
>>  ensure_not_expanded () {
>>         rm -f trace2.txt &&
>>         echo >>sparse-index/untracked.txt &&
>> --
>> gitgitgadget
> 
> The rest of the patch looks fine.
> 




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux