Re: [PATCH 15/27] [RFC-VERSION] *: ensure full index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/1/2021 3:22 PM, Elijah Newren wrote:
> On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
>>
>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>
>> This giant patch is not intended for actual review. I have a branch that
>> has these changes split out in a sane way with some commentary in each
>> file that is modified.
>>
>> The idea here is to guard certain portions of the codebase that do not
>> know how to handle sparse indexes by ensuring that the index is expanded
>> to a full index before proceeding with the logic.
>>
>> This also provides a good mechanism for testing which code needs
>> updating to enable the sparse index in a Git builtin. The builtin can
>> set the_repository->settings.command_requires_full_index to zero and
>> then we can debug the command with a breakpoint on ensure_full_index().
>> That identifies the portion of code that needs adjusting before enabling
>> sparse indexes for that command.
>>
>> Some index operations must be changed to operate on a non-const pointer,
>> since ensuring a full index will modify the index itself.
>>
>> There are likely some gaps to these protections, which is why it will be
>> important to carefully test each scenario as we relax the requirements.
>> I expect that to be a long effort.
> 
> I think the idea makes sense; it provides a way for us to
> incrementally build support for this new feature.
> 
> I skimmed over the code and noticed various interesting places that
> had the ensure_full_index() call (e.g.
> read_skip_worktree_file_from_index() -- whose existence comes from
> sparsity; what irony...).  Better breakouts would be great, so I'll
> defer commenting much until then.  But, just to verify I'm
> understanding: the primary defence is the command_requires_full_index
> setting, and you have added several ensure_full_index() calls
> throughout the code in places you believe would need to be fixed up in
> case someone switches the command_requires_full_index setting.  Is
> that correct?  And your comment on the gaps is just that there may be
> other places that are missing the secondary protection (as opposed to
> my first reading of that paragraph as suggesting we aren't sure if we
> have enough protections yet and need to add more before this moves out
> of RFC); is that right?

Yes, the idea is that we can incrementally enable
command_requires_full_index for some builtins and be confident that
corner cases will be protected by ensure_full_index(). Further, we
can test whether ensure_full_index() was called using test_region
in test scripts to demonstrate that a command is truly "sparse aware"
or if it is converting to full and back to sparse.

There is also the case that when we write the index into a sparse
format, the in-memory structure is modified. If the index is re-used
afterwards, then we must expand to full again for these code paths.

unpack_trees() already has one of these calls because it was necessary
for the sparse-index write to work.

The ensure_full_index() pattern also works when updating a builtin to
work with the sparse-index because of the breakpoint trick.

When I submit this as a full series, this patch will be one full
patch series submission with careful comments about why each of these
is added on a file-by-file basis.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux